Figure 4. Schematic diagram of the Lama2/APPA construct.
Figure 5. The nucleic acid sequence of the Lama2/APPA plasmid (SEQ ID NO: 1)



LOCUS       Lama-appA    20623 bp    DNA    CIRCULAR    SYN    17-JAN-2000
DEFINITION  Lama 2/APPA transgenic construct
ACCESSION   Lama 2-appA
KEYWORDS    parotid secretory protein; acid glucose-1-phosphatase; appA gene;
            periplasmic phosphoanhydride phosphohydrolase; artificial sequence;
            cloning vector
REFERENCE   1 (bases 1 to 20623)
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.
FEATURES

DEFINITION  M. musculus Psp gene for parotid secretory protein.
ACCESSION   X68699
VERSION     X68699.1 GI:53809
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 3777 to 5332)
  AUTHORS   Svendsen,P., Laursen,J., Krogh-Pedersen,H. and Hjorth,J.P.
  TITLE     Novel salivary gland specific binding elements located in the PSP
            proximal enhancer core
  JOURNAL   Nucleic Acids Res. 26 (11), 2761-2770 (1998)
  MEDLINE   98256451
REFERENCE   2 (bases 7147 to 12653; 13952 to 17731)
  AUTHORS   Mikkelsen,T.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-OCT-1992) T.R. Mikkelsen, Department of Molecular
            Biology, University of Aarhus, CF Mollers Alle 130, 8000
            Aarhus, DENMARK
REFERENCE   3 (bases 7147 to 12653; 13952 to 17731)
  AUTHORS   Laursen J, Hjorth JP
  TITLE     A cassette for high-level expression in the mouse salivary glands.
  JOURNAL   Gene 1997 Oct 1; 198 (1-2): 367-72
  MEDLINE   9370303



FEATURES             Location/Qualifiers
     source          1 to 12653; 13952 to 17731
                     /organism="Mus musculus"
                     /strain="C3H/As"
                     /db_xref="taxon:10090"
                     /chromosome="2"
                     /map="Estimate: 69 cM from centromere"
                     /clone="Lambda YP1, Lambda YP3, Lambda YP7"
                     /clone_lib="Lambda-PHAGE (Lambda L47.1)"
                     /germline
                     /note="Allele: b"
     misc_feature    3777-5332
                     /gene="PSP"
                     /function="salivary gland specific positive acting
                     regulatory region"
     enhancer        7147..6724
                     /evidence=experimental
     exon            11778..11824
                     /gene="Psp"
                     /note="exon a"
                     /number=1
                     /evidence=experimental
     exon            12626..14190
                     /gene="Psp"
                     /note="exon b fused with exons h and i"
     misc_feature    12644-12652
                     /function="consensus sequence for initiation in higher
                     eukaryotes"
     misc_feature    13952-13965
                     /function="M13mp18 polylinker"

DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA) gene,
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1 (bases 12653 to 13951)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
  TITLE     The complete nucleotide sequence of the Escherichia coli gene appA
            reveals significant homology between pH 2.5 acid phosphatase
            and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          12653..13951
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     12653..12718
                     /gene="appA"
     CDS             12653..13951
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
                     /translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
                     TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
                     GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
                     NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
                     ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
                     YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
                     ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
                     PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"
     mat_peptide     12719..13948
                     /gene="appA"
                     /product="periplasmic phosphoanhydride phosphohydrolase"
     mutation        replace(12659..12661, "gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"
     mutation        replace(13934..13936, "ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"
     mutation        replace(13937..13939, "gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /citation=[3]
                     /phenotype="silent mutation"

DEFINITION  pBluescript II KS(+) vector DNA,
ACCESSION   X52327
VERSION     X52327.1 GI:58061
KEYWORDS    artificial sequence; cloning vector; expression vector; vector.
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 17732 to 20623)
  AUTHORS   Thomas,E.A.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-FEB-1990) Thomas E.A., Stratagene Cloning
            Systems, 11099 North Torrey Pines Rd., La Jolla, CA 92037, USA
REFERENCE   2 (bases 17732 to 20623)
  AUTHORS   Short,J.M., Fernandez,J-M., Sorge,J.A. and Huse,W.D.
  TITLE     Lambda ZAP: a bacteriophage lambda expression vector with in
            vivo excision properties
  JOURNAL   Nucleic Acids Res. 16 (15), 7583-7600 (1988)
  MEDLINE   88319944
REFERENCE   3 (bases 17732 to 20623)
  AUTHORS   Alting-Mees,M.A. and Short,J.M.
  TITLE     pBluescript II: gene mapping vectors
  JOURNAL   Nucleic Acids Res. 17 (22), 9494 (1989)
  MEDLINE   90067967
FEATURES             Location/Qualifiers
     source          17732 to 20623
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     CDS             complement(18967..19827)
                     /gene="Amp"
                     /product="b-lactamase"



BASE COUNT     5449 a   4847 c   4902 g   5424 t
ORIGIN
1 TCGAGAGTAT CTTTGTCAGC TGTGCCTCCA ACAAAGGGGT ACTGTTGCCC ACATAGAAAG
61 ATCTAAACTA ATTAATTAAT CCCTCACCCG CAAATCTTTC AGTCACTAAG TTAGCACGAT
121 TGTTGAACAA GTTCTCCAAA GGAGAGATAC AGATGAGTGC GTATAGGGTG GACCTGGCTG
181 CTGAGGAGAC ACCTGCATCT GACTAAGAAG AGCCACGGTG TTAGTTGAAT GGTGTGGAGT
241 AGGGTGGTTC TGTGGGACAG TAGAAAATCG AGAGGCATGT GCCGTTTAGT GAACTGATGG
301 AAGCTACCCC AAACGACAGA GATTGTCAGT CAGGCCAATC CGTTTCGAGT TTGATGGGCA
361 GCCGGACAGT GAGACAGACA CACCTACTCA GTTGGAGGAA GGATGAGAAC AATGGCCAGC
421 AGGGATTGAG AGACCCTGAC AGGCGCAAGG CCCTAACACA CACACCTACC ACCTCACTTG
481 ACAAAGCTGC CAAAGACCAA AGACTTGTTC TCCATTAGAA ATGACAGCTG GCTTGACCCG
541 ACAGCATAAT AAGCAGAGTG TACTCTGATT GGAGAACTTT AATGTGTTTC ATTCAGTATT
601 ATAAAAGGAC AGTATTACAG ATTTTGTTGT ACACTGCTGT TACATGTGGG GCAGTGTGTC
661 TTTAAGTAGG GTAAAGTACT CTTTAAAAAT GGGTCCTAGA TATTTTTTCC TTTAACTCAA
721 GTCTCTTACT GTTTAAATGA TTTTTATTTT GTTTAATATG GAGGAAAAAG AAGCGTAAAT
781 GGACAATATA TATTTAGAGA AAGATGGTTA GCTGTCAGAA AAATATGCAA ATCAAAATCA
841 CACCAAGACT GCAGCACACC CCTGTCAGAT GGCTGTGATC AAGAAAATAA ATGACAATGA
901 GTGGTGGTGA AGATGTACTA AAGGGAAACA CACACACACA CACACACACA CACACACACA
961 CACACTGGAG CAACCACTGT GGAAATCAGT ATGAATGGTC CTCAAAAACC TGAAGATAGA
1021 GCGGGGCGTG GTGGCATACA CTTTTATTCC CAGCACTGGG GAGGCAGAGG CAGGTGGATC
1081 TCTGAGTTCC AGGCCAGCCT GGTCTATAGC ACAGGTTCTA GGACAGCCAG GGCTACACAG
1141 AAAAACCCTG CCTTGATTAA ACCAAACCAA ACCAAACCAA ACCAAACCAA ACCAAACCAA
1201 ACCAAACCAA ACCAAACCAG ACCAAACCAA AACACTGAAG ATAGAACTTC AGTATTCCAT
1261 TCCTAGATAT ATACCCAATG GAGACTAAGT CAGCAAGACA CCTGCACAGC CATGTTCACT
1321 ACTACACTGT TCACCACAGC CAGGCTGTGG AACCAGCCTG AGTGTCCATG ATAAATGAAT
1381 GGATAGGTAA CTTTCA^GGT AAATGGACTC TGCTGTGTAC ATGCCTCACA TTCTGTTTAT
1441 TCATTTTTCT TTATGAGGTG TCCATTCAGG AGTCACATGG TAGTTCTATT TTCAGTCTTC
1501 TGAAGATACT ACACTGGTCC CCACAGTTTA CACTTTTATC AGCAGTGAAT AAGGGTTCCT
1561 CTATCCTTAC CATCATTTGT TGTAATTTTT CTTGATGACC CTCTTTCTGA CAGGGATAGG
1621 ATGTAATATC AGTGTGAGGA AGTACAACTT GTTTTCTAAG TATTTATTGG CCCCTTGCAT
1681 TTCTTCTTTT GAAAACTGTC GGTTCCTGAC ATCTGCTCAG GTATTCATTG GATGTTGTTT

1741 CTTTGGTGTT TGAGTTCTTA TGAATTCTAG ATGTTAAATC CCTGCCTGTG GTTCTCTCCC
1801 ATTCTGTAGG CTGCCTCCTC ACCCTGGCAA TTGTTGTCCT TGTTTTGCAG AAACTTTTGA
1861 CTTCATGGAA TCTCATTTGT CAGTTTTCCC TCCTCTGCTA TAGCCTGAGC TAATGCACTG
1921 GTTTTTACAG AGCCCTGGTC TATGCCTTTA TCCTCCTCTG GCAGCTTCGG AGTTTCATTT
1981 CTTACATTTA GATCTTTGAT CCACTTTGAA CAAGTTTTGG AGCAGGGTGA GAGATACGAA
2041 TCTAGTTCCA TTCTTCCATA TGTGATCCTA GTTTACATAG CATCGTTGGT TGAAGAGGTT
2101 TTATTTTATT TTTAAATAAT GTGTCATAAA AAACGAGGTG GTTGTAGCAG TGTGGATTTG
2161 TTTCTTTGTC CTTTGATCTA CAGGTCTTGT TTTGTGTCAG TCTCATGATG TTTTATTGCT
2221 ATGGCTCTGT CATACAGTCT GAGGTCAGGT ATTGTGATAT ACCTTCAGTA TTGCTCCCTC
2281 AGACTCAGGT TTGCTTTGGC CAGGAGTCAT CTTACTCAGT GCTCTTAGAG CTCCCCCAGC
2341 ATGTAGCTGC TACTATTCTT AGTTGATAAA TCAGGAAACT GGGGCTCAGA GAGATTAACT
2401 GTCTTGAACT ACTTCTGGGG AGGTGAAACG TGGAGACACT AAACTGTGTT TACCCTGTAC
2461 TGCTCCAGTA GCTGTCGGGT GCTGGGCTAC AGCAAAGCAC CTATACTATA TATTACTCAG
2521 GAGGTGGAAA AACTCAGCCT CCCTTGGGGT TCCCAAGCTC CCAGGTGTCC AGTCACTGCT
2581 GGAAACCTCA TGGAGTCTGA AAGGAAGGGT TGAGGGTACA TGGGGCAGCG ATGAGGAGCC
2641 TGGGGCTGGG ATCTCCCAAA CACCTGGATA TCCAGATGCC ACTGGGTCAG GGGGAGTTGG
2701 GAACAGAGTT GGGATGTCCA TGGACCTGTG ACAAGGCCAG GGCCAGGGGG AGGATAACTC
2761 TGGCTTTACT AATTTGCGAA AGTCCTTAGC TTAGCAGCAG TTGTCTGGGA GCACAGAGGG
2821 GCCTTCTGTA AGAGGCTCAG GCAGTGCCGC TCTGTAGGCG AAGGTCTTCT CCATGTTCCC
2881 CATGGTGGTT CTTGATGAAA GAGACAGTCC TTGGCTCCAA ACTGGTTTAT TGATTGTTCA
2941 TTGTGGAAAA TGGGTGCACA CCACCTTCTC AGGGTGGACC AGAGATCAAA TACCTTTTGC
3001 AGGGAGGAAT ATCTGGGAAG GGACGCTTAC TGGCTAAACC CTCAGGGCCT CTAGATACAT
3061 CATTAGCATG GAGAACTCTG TTCTGGGCTA CATGACCACA GGCCACATTT CCACAAGCCA
3121 CATGTGGGAA GTGTGGCACA TGTTCTAGGC CAGGAATCTG GTAGGGAGCG TGGAGCCACC
3181 TACCATCCCA GGTGGGTGCC TGGGTGCCAG GGACCCTGAA CCCGCTCAAC CTTACCAAGT
3241 TTCCTGGCAG GGTCCACTGT CCTACACAGA AGCTGGAGGA GGTGTGAGGG TTGTGTCTTT
3301 GTGGAATGTC CCATGCTGCT TGGGGCTCAG TTTCTCCACC TGTACCTCAT TGGTTTGGGT
3361 ATAAAAAGTG GGGATACTTT ATTATTCTCT GACTCGGTCC TGAGGAAAAA GCATCGTGGC
3421 AGTCCAGGAA CCACACCCTG AGGTTCCTGC ACTGAAGGGA CTCCCTAAGT CTCTGGAGTC
3481 TCTCCCCTTC ACAGAGCTGC CAAAGTCTAG GTTCTTTTGA GGATAACAGA GCCATGCTTG
3541 GTAAGCAGAC AACAGCATTT GTTTACTCAA CCTTCTTTTG TCAGCTCCCT CTTCATAAAC
3601 AAGTTGAGAC ACCATGCTGG CTTGAGGAAG ACTTCTAAAG CCAGACAACT GTGCAAGGAA
3661 GAAGAAGAAG GGGCAAGTGG AGTTAGCCTG GATGTAGCCC TCAAAGTCTC CAGAGACCAG
3721 CCATGAAGGC TCAAGTGGAG GGCAAGACCT GCAGCAGCCA AGCATCTGGC AGGAGAGGAT
3781 CCTGGGAACC CCTCTACCAT GACACACATT CTTCCTGCAG GTCACACTTA ATAGGCCATT
3841 TCTTATTTGG ATCTATCATG GTGTTCTGTG CGAGATTAAT GAGGTGTTAT GCTGCGAACA
3901 GAAAGTTATA TAAAAACAAG TCCCCCCCCC TTGTCACTGC TGCTAAGAAT GTAGCAGAAA
3961 TTGTCTCAAG TGTCTCTCTA ATCAGAAACA ATAAAGGTCT CCTTGGATTC AAGCCCTCCA
4021 GTTTCCTCCT TCCTTGCTGA GCCTTGGACA CCCATACAAA CCTCCTGGAT GCTACAGCTC
4081 TGGGCAGAGA CTCCAAGGTG GGGAGAGACT GATGGTACAA AAGCAAAATA CTTGTTTGGG
4141 GGTACACCCA CTCCTCTGCC TGTGTGGTTC CTGCAGTCAG TCCTGCAGAC AGGCCCTCAG
4201 TGGGTCTTCC ATGGGCAACA CGCAGAGGGA GGCAATGGAT GGGAATACCC ACACCCTGGT
4261 TAGTTTACCC CGGCCATGCT CTCTGCTCTT CATCCCTCCT CTGCCCTCTG CCACGGCTTT
4321 CTCTGCAGGA ATCATATCTT CATATTGGCC CACAGGTGTT CTCCTCACCC TAGCTATGAT
4381 GTTTACTTTA GAGTGACCTT AGCAGGGCTG GTGGGAATGA GTTCTAGAAG GCTCACGGAG
4441 ATGCTAGGGA AGAAACGTCT TCTAACTACT GAGGTTACTA AGTTCCTGGT GGTTGTCTCT
4501 GCCTTTCCCT TGTTAAAGTC ACCTTGAAGT TAGTGCAGAA GAAATCAGAG CCCAGTCACA
4561 GAGTAAATAT GGTCCTGAAG ATTTCCTTTG AGTGCCCAGA ATCCATGACA TTTCAAGAGC
4621 CCTCTTTGTA CCTTAAGTCA TTTGGGGTTG TATCTTCTGC TTGATGTATG TGTGTGTGTT
4681 TATCAAAGAG TGAGATGGTT ACATAAGAGG TGCTCTAAAG GACAGAGAGG ATTTGCAATT
4741 GTGGCATGTG ACATCCTCAG GCCTTGCTCT GGTGCCAGGA GGAACTGATG CAGAAAAGAG
4801 TAAGAGGTCA TTTCCTGGAG GCTGTCACTA TAGAGGAGAT CTTACAGTGC ATTCCCTCCT
4861 CCAGGCCCTG CCTGAGGATA GACATGTGCT GACTGCAACT GAAACAGAGG CTTGGGATGG
4921 AGAGTTAGGT TCACAGAAGG GAGGGTGGGA GATGGATGCT TGCTGGGTTC TGGGTCTCAT
4981 CACCAGCTCC TGACCACCCG GTCAGCCCAT GTGCTTATTC CATAGCTTTC TTTTGCTATG
5041 TTTACTCAGT GTGGTGTTTG TTGGGACCCA GCAGAAGCCA GTCCCAGGCT GACAGCTGTG
5101 GATACACAGG GCAGCATGAG GGTCCTCAGC CTGAAGCAGT CAGGCTGGCA GAAGAGAAAG
5161 ACCAGCACAC ATTCCTTCAA CCAACTATGT CTTGAAAAAC AAACATATTA TATCACATAT
5221 ATTGCATTTA TGAGACAGCT AAAATGTACT CGGGTAGCAT GACTCCAGGT GGGGATATCT
5281 GCAAGTGCCA TGAGTGGCAG AGGGACAGCC AATGTGAGGC AAGAAGGAAT TCTGGCTCAA
5341 CACAGCTTAG CTCCCTGGTG TTGGTTCAAA CTTTGAGAGT TTGACCACAA GCACTTTATT
5401 TTTGACATAT TTAAACAGAG CACAACTTTG GGAAAAAGTT TTCTTATGAA AATTATCACA
5461 ATAAAGCTTA AGGCATGACT ACATTAAAAT GCCTTTGCAA AGTATATGTG CCCTCTTCCA
5521 CAAGAATGGT TCTATTGACT GAGAAATAAT GTTCAGGATA AAGATCCAGG AAGAAAAGAT
5581 CAGGGATAAG TAAAATACTA AACTCTTTTG CAAAGTACAT AGACCCTCTT TCATAACAAT

WO 00/64247
PCT/CA00/00430

5641 GGGTTCTATT GACTGACAAG CACTGCTCAG GAGTTGGGAA AGAGTCTAGC ATAAGCACGA
5701 TAGCCTGGAG ACTCTAGTGA GGTCTAGTCT TACAGACAGC AAAAATCACC AGGTTACAAA
5761 CTACATTCAT TTCCAGTTTT CTGATCAGGC ACAGGTATGA ATCCCTTCTG TTGAAGAGAA
5821 AAGTCCATGT GTTTAAAATA TCTGGTTTCT CCAGTGCTAT TAGCGAGAAG ACTTGAGCCC
5881 TATACAACTC CCACCTGGAG TGACATCCTG TCTTCATGGT ATATTACATA CCTAGACACG
5941 CTCATCTCAC AGACTTAGGA CTTTGTCTTC TGATCTCCAT TTCTGATCCC ACTTCCACCT
6001 TTGCCTTGAT AGTGTCATTT TCTTCACTGC CTTGGTGACA ACCATGTTAT CCTCTGTGTA
6061 TTTGAGTGTT ACCATTTTCA GATTTTACCT GTATGCAAGA TCACACAGTC TTTGTCTTTC
6121 TGTCTGGATG CATGCTAATC TCTACACAAC AACCCTTCCC CGTCACTCAG ATCTTCCTCC
6181 ATTAACACAT ACATGGTGCT GAAGAGGCTA GGGAGCTTCC CTTCAGTGGG GAGCTAGCTG
6241 GCTATTGGGC CTTTTTGACT GTCCAGGAAG GCCCCCAATT GCTGAGACAA GAACTTAGAT
6301 TCTTCATTAT TGACTCTAAC TCATGTATCA AGCAGAAGCT AATGAATAGT TATCAACAGG
6361 ATCAGAGGTT CCAGTGTAAG ACACTTTGAC ATGAAAGAAC GGAGGAAGGA CAGATGGATG
6421 CATAAAAGCA GGACCACTGC CCCAGGAAGG TCCTGGAAAC TGATGCAGGG CAAAGGACAG
6481 GTTATAAACC AAATCTTAGG GAGTCAGGAA GAGCACAGAG GAGCTCAACC AACTGACCAC
6541 TGCTTAGGGG CTACCAACCC AATCCTCCCT GTGGGAACAG CTAAGCTATC AGCCAAGGGT
6601 AATAAACAGG CAGGACCTGT GGATGACATG GAGAGCATAG GGACCCTGGG TCCAGCCTTT
6661 AGCACCTGCA CTCTCAGGAT ACTCCACCAT TGTGTCTTAG AGAGCCTAGG GATACTGGGT
6721 CCAGCCTTTG GTACCTTCAC TCTCAGGGTA CCCCATCACT GTGTCTTGGA GAGCCTAGGC
6781 ACCCTGGGTC CAGCCTTCAG TACCTGCGCT CTCAGGACAC CCCACCATTG TCTCTTGCCC
6841 CGTCTCTTCT TCCTCTTCCT CCCTTTCATT GTCTCTTCTC TGTTTCTTTC TTGACTCTCC
6901 TTTCCCCTCA CACCCTCACT ctagttctcc ccttccctct ctgcatcacc ctattctctc
6961 tgtggtccct ccactttcct ttatctctca tgcttctctc ctccctcaaa tacttgtcac
7021 CCACTATACT TCAGGGGCCA GCTCTAGTGA CAAAGCTGTT AATAGCAAGA CTCTCAGATC
7081 TCCAACGGCT CAGAGGAGCC AGACCCACCA AGAACTCTCT CCAGGTCCAA TTTCAGGTTC
7141 CTTCGAAAGC TTTCAGCAAA TGCTCAGGGA ACATGCCACT AACAAGAAGA TGCAAATTCC
7201 AGTTGAGAGT GGGAAAGGCC CTTGCGTAGG TCCCATCTTC CAGGCCAAGG TCAGAGGGGC
7261 TCTGTGTAAT CCGGATTGAC AGGGCTCAGA ACAATGTTTT GTTTTTAAGG TTTATTTATT
7321 TTAGGTGTTA GTGTCTTTGC TTGCATGACC TTATGTGCAT CATGTGTGTG CAGGTTCCTG
7381 ATGACAGTAG AGGAGGGCTT TGAATCCCTG GGGATAGGAA GTTACAGGAA ATTATAAGCT
7441 GCTTTGTGGG TCTTCTAGCT TTCCCAACAG AAGTGAATGC TCTTCACCAC TGAGCCATCT
7501 CTCTAGGCCC AAGAGACATT GCTTTATGGA TATAATTGTG TGTGTGTGTC AACATTGAGG
7561 AAAGGGAAAT AAAAAAAAAA CTTCAGCCGC TAAGGTTGTA CAGTTTCACT AATTGCTACT
7621 TTTAGTTGTG ATAAAATGGC AGGTGCTTCA ACATTTATAT ATACAAAAAC TTCCCTGCTG
7681 GTGGTTCAAC TGTGAGAACT GGGGTAAGTG GGTGAGTTCT CTTTTTCTGT CTCTGTCTCT
7741 GTCTCTCTCC TTCCATTCTT TCTTAAAGGA AATAAACATT GCAGCTGGGT TATAGCTCAT
7801 CAATATGGAA GTTACAGAAG TGAAAAAAGG CATTGCCTTG GTGGGTGGTG TTACCAGCTG
7861 ATTTTTGGTT GTCCTGCAAG GAGGTCTGGG GACTGGCTGC TCTGTCTCTG TCTGTATGAG
7921 TGAGGGAAGT CTGGGGAGCA GATTCCCTAA CCTTCAGCCT GGCCTGGTTC CTGAGTGAAC
7981 CCAGCCTCTC TGGTCCTAGT AGCTTTTTCC AAACAGGAAT CTGAGTGGTG ACAGGGAACA
8041 AGTACCAGCC CATTGCTTAA GTGCCAGGGT TAGTGAGGGC AGGAAGCTGC CATAGCTGGG
8101 ATTAGTAGTT GTATTGGATG TAGGAAGTCC TATCCTGGGA CAGCTAATCC TTAATGCTTC
8161 ACTGGAGATT TTCAATGAGA AATTTATCCC ACGGCCCATA TGGCCCCATC CTTTTGTCTC
8221 CAACAGCCAA GTATTTTCCA TTAGAGGAGA CTTCCTGTAC ACTTGATGGA TGCTCATTCC
8281 AAGGTGACTT GGGGCAGTCA GTACAGACTT GGGATGACCT CTGACAGCCT AACCTCTCCC
8341 CAACAAGGGC CCTCTATGTT TGCTATGTAA TGTAATGTCA GACATTGTCA GGAGTGTCCG
8401 CAGCACAGCC TGCCCAGTGT GAGGGCTCTC ATAGGTTTCC CACTGTCTTA TCTACACAGG
8461 GATAACGAGG AGGTAAGCTG CAGTTCCCAG TCTCACTTCA CAGAGGAAGA GATAACCCCA
8521 TCCCAGGTCA TGTAGCCAGC AGTGGAAAGA ATGAGGATTT GAACTCAGGT CTTCCAAGTC
8581 CCATTGATAG CATCTCCTCA CAAGTCCCTT GCCACCCTCA CGATGCCTTA GACACTTGCC
8641 TGCCCTTTAT ACTAAGGAGA TGCAGGTACA AGGGGTTTAC CCATGTAGCA GCTGAGGCAG
8701 CTGGGGATAG ATACCAGCAG CAGGCCTGAT GTCACCACTC TAACTCCAGC ATCCCCAGTC
8761 TGTGTTCCTG GAGTGTGAAA ATCCCTACTT AACAAGATTG TGCAACAGTC CTTGGCTCTG
8821 TGACCCATAG CTGGAAACAG GATTCTCATT GATTTGTGGA ACATGGTGGC AGCCAGCCAA
8881 AAAGAGGGTC TGCATACAGA AGACACGTGT GGCAAGGCCA CAGCAGACTC TGACTACCTT
8941 AGCTTACAGA ATTACAAGGT CATAATGTCC TCTGCTTTGG TCACCTCATG TTAAGGACAG
9001 GCCCTAATGA AGATGGGGCA GAAGACTGAA GGAATGGCCA ACCAATAACT GGCCCAACTT
9061 GAGACCCATC CTACAGGCAA GCATCAATTC CTGACACTAC TAATGATACT CTGTTATGC,
9121 TGCAGACAGA AGCCTAGCAT AACTATCCTC CGAGAGGTCC ACCCAGCAAC TGACTGAAAC
9181 AGAAAAAGAT ATCCACAGGC AAACAGTGGA TGGAGGTCAG GGACTATTAT GGGAGAGCTG
9241 TGGGAAGGAT TAAAAACCCT GAAGGGGATA GGAACCCCAC AGGAAGACCA ACAGAGTCA*.
9301 CTAAGAGACC TGTGGGAGCT CTCAGAGACT GAGCCACCAA CCAAAGAGCA TACACAGGCC
9361 GGTCCGAGGC ACCTGGCACG TGTGAAGCAG ACATGCAGCT CAGTCTCCAT GTAGGTCCTC
9421 CAATAAGCGG TAGCCTGACT GCAGTATCCA ATCCCCAACA GGGCTGCATA GTCTGGCCTC
9481 AGTGGGGGAG GATGCCCCTA ATCCTGCAGA GACTTGATGA GTGGAGAGCT ATCCAGG^Gv,

9541 AACCCACCCT CTCTGAGAAG GGAATGGGGA TGGGGGAGGG ACTCTGTGAA GAGGGGACAA
9601 GGACAAACAA GAACCTCAAA TAGGTCAGGC CCTAAAGGCT TGCTAAGTAG CAGTGGCCCA
9661 GCTCTGTCCT GTTCCTCAGC CCAAGGCTCA GCTCCCACCT GTTTCTGTGT TTTTCTGGCT
9721 TTTCATGGGC CTAGGACTTG GTGACCAGTT CAAACAATGG GGCCTGTGGA AGACACAATA
9781 TACAAGACTA GGGACATTCC TGTTCTGCTG ACTATCCATA GCCTGATGTA GGTGGAAGGA
9841 CCCAATCACT GGATTTCTAC CCTTGCACAA CCTTGACAGC TGAGGGCCTC TCAGAAACCT
9901 ATTTCTTCCA CTGAAAAATG AGACTCTCAA ATGAACGTCG TGACAATCAT CAGGCTTATT
9961 AAAGAGGTGT ATCTAACCTG AATGGCAAGC AGACAGCAGG CAAATGTCTG TATCAACCTC
10021 TAGGAAGGAC AAGAACTGCT CACTGCTGCC CCCCAGGAGG CCATTTGCTG AAACAGCTGC
10081 TCTCCTGCTG GTGCACAGGC CCTGCCTTCT CATTGCAGCC ACAGCCCCTT CCTGTCTGAA
10141 CCTCCTGTCA GGTCACTGGG AAACAGATCA AGATGGAACA GGACAGCTCC TGATGGTAAA
10201 TAAAAAACAG TGGTCATGGC TATTCATAGG GGTTTATGCT TCTTCAGTCC ACACTGTGAA
10261 GAGCTGTGGG CATGAACCAC AGTGTTCGAG GTAGAGTTGG GGTTCTGAAA TTCACAGTGG
10321 GGTGAGCTCA GTAAATGTGA GCTGGAGGTC ACTCGTGAGA CACACAGTCC TGCTGCTTCT
10381 GTTCCCAATA TCCTGAGGAG ACGACACATC TACTTTGTTC AGAGGCCACA GTCTAGTTGA
10441 CCTGAGAGTT ACCAGTTTCT TATTTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG
10501 TGTTGTTCGT GTGTGAGTGC AGGTGCACAT ATGATAGCGT ACACGTTGAG GTCAGAGGAT
10561 AACTATCAGG CGTTGTCCCC TCCTACTTTT CCTCGGACTC TGGAGAACAA ACATGGGTCC
10621 TTATTCCAGG GGAGCAAGTC GCTGTTGGCT GACACATCTT GCTCACATAC ATTTTACCTA
10681 GACAATGGAG CCTCCATCAG AGTATTACTT TAGCTCCTCA CCGATGGCAA TGCACCACCT
10741 CTCTACCCAC ATAGGAGTTG GGTCTCCACA CACCCCCACA CCCCCTTCAC CAAAACGTTT
10801 TCAGTTACTT TATCTGGTAA AGTTCATCAG AGAATGAAGC CAGTATTAAG AACATGGAAT
10861 CATTTGGGAA CCTGGATCTA GCAATACCCC ACCCTAGATG GAGTTGCTGA GTTTTCACOT
10921 CAGATTATAA TTCCCCCCTA GCTTCTATGG TTTATTCTGA AACCAGGGGA ACTCGATTCC
10981 TCCCTTTGGA CCACAGACAT CCTGGCTTGT GAATTCACAT GTCATCTACT GCTAATCCAT
11041 TGGTAGTATG TGGCTCACAG AGACACACTA CAGTCATGGC CAATGTCAAG GTAGGACAGA
11101 TGTGAATCAT TCCCCCAGTC CTGCTGTTTT CATGACTAAC CCTCCTCAGC ACAGTGACCA
11161 TGAACCTACT TTTCCCCTCC TTTTATTTTT AGAATTGCTG GAATTTTCTA TTTTGAGAAA
11221 TAATAGCCTT GGGCAGCATT AAACAAAATC ATCTAGAAAG CTGGTTTAAA ATACAGATGG
11281 TTGAGTCAGT GAAAGAGTGA GGAATGTCAT TATTGGCCCC TCACAGAGGC TGGCTCACTC
11341 CAGCAGAGGT GGTTGAAGCT CTTGGACACG GGTCAGGTGC ATAGGAAAGG TNGTCTGGGA
11401 CACTGAGAAC CACAATTGAA CAAACAGAAC TGTTGGCTTT TTTTTTTTTA AATGAGTTCT
11461 CAAAAAATGA CTGGCTAGCT TAGGCAAATA CTTCGAGCCA ACCCAACAGA ACATTCTTCC
11521 ATTGATTCAT TCTGGATCTT CTTTCTAGAC AATACTGAAC TGACCCCTTG TTGGCAGTCT
11581 CAAGTTTGAC AACATAGGGC TTTGAACTTG GCACAAGGTC CATCACTGTC ACCCAAGCAT
11641 CCTGGGTGAC CTTTGGGTTG GAATATCTTG GCTAACCTTA GATATTTTCT TTGGAGTATC
11701 TTTAGAACAT CCAGGAAATA GGGCTTGATT CTCATCCTGG GACCACAATA TAAGTCACCC
11761 TAGAATCCCA GGAGATCGTG CAGAGAAACA AGGATCTCTC TCGTGTGCAT CCTTCTTCAA
11821 AGCAGTGAGT AGTGACTCCA CTAAACTGAG TTCCCATCTG AGAGTCCACA GGAGGCTTTG
11881 GGGCAAGAAG CAGAGGGAAG GCACTGTTTG TGTTGGTAAA GTTTTGACTC TAACAAATTT
11941 GAAGACATAG ATGACATTGT GTCAGACTAA CAACAACCTA GACTCATGTG GGTTCTGTTT
12001 AGGGATCAGA TTTTATTCAT CAATGACTTG TCTTAGTGTA TAGAGAAAGG CTTCCTACTG
12061 GAGTGTAGGC TCAATAATGA CAGAAGAGAT AGCTATTTCC CCTAGGGACT GTGCTGCTCC
12121 AAGTTTGGTG GAGAAAGGCA GTGGGGAACC TAGATGTGCT CTCTGGGGAG GGGGTCTGAA
12181 GCTGGCTTCA TAGAAGGTGT GAAGTTTTGC TGAAACATCT AAACAGAATT ATAGCTTAGG
12241 AAAGTGAGCA GGCAAGGCAG GGAATGTGTT GCATATGTAT ATGTACATGA ATATATTATG
12301 TTATAGATAC ACACACATTT GAACCTCATT TGCAGATGAC AGAAAATAGG TTATTTTGCC
12361 TCTCTTAACT GCTAAGCACA ATGACTTCCA GTTCCATCCA TTTCCTGAAA TGCCACAATT
12421 TCATTTTTCA TTGTGGCTGA ATAAAATTCC ATTGCAGACT GGGCCCTACT TCATCCACTC
12481 CTGAGGGCAG GCATATCCCC TGGCTCCATT TCTTACCTAT TGTGAAGAGA AGTGCAACTG
12541 TCTTGTTGAA AGGCAAGCGT GAGAGAGGCA GGCACTAATT GTGGGTTTTT GTTTCTTCTT
12601 CCTGCTATGA CTCTCCATTT GTCAGAACCA AAGATCGATA AAAGCCGCCA CCATGAAAGC
12661 CATCTTAATC CCATTTTTAT CTCTTCTGAT TCCGTTAACC CCGCAATCTG CATTCGCTCA
12721 GAGTGAGCCG GAGCTGAAGC TGGAAAGTGT GGTGATTGTC AGTCGTCATG GTGTGCGTGC
12781 TCCAACCAAG GCCACGCAAC TGATGCAGGA TGTCACCCCA GACGCATGGC CAACCTGGCC
12841 GGTAAAACTG GGTTGGCTGA CACCGCGCGG TGGTGAGCTA ATCGCCTATC TCGGACATTA
12901 CCAACGCCAG CGTCTGGTAG CCGACGGATT GCTGGCGAAA AAGGGCTGCC CGCAGTCTGG
12961 TCAGGTCGCG ATTATTGCTG ATGTCGACGA GCGTACCCGT AAAACAGGCG AAGCCTTCGC
13021 CGCCGGGCTG GCACCTGACT GTGCAATAAC CGTACATACC CAGGCAGATA CGTCCAGTCC
13081 CGATCCGTTA TTTAATCCTC TAAAAACTGG CGTTTGCCAA CTGGATAACG CGAACGTGAC
13141 TGACGCGATC CTCAGCAGGG CAGGAGGGTC AATTGCTGAC TTTACCGGGC ATCGGCAAAC
13201 GGCGTTTCGC GAACTGGAAC GGGTGCTTAA TTTTCCGCAA TCAAACTTGT GCCTTAAACG
13261 TGAGAAACAG GACGAAAGCT GTTCATTAAC GCAGGCATTA CCATCGGAAC TCAAGGTGAG
13321 CGCCGACAAT GTCTCATTAA CCGGTGCGGT AAGCCTCGCA TCAATGCTGA CGGAGATATT
13381 TCTCCTGCAA CAAGCACAGG GAATGCCGGA GCCGGGGTGG GGAAGGATCA CCGATTCACA

13441 CCAGTGGAAC ACCTTGCTAA GTTTGCATAA CGCGCAATTT TATTTGCTAC AACGCACGCC
13501 AGAGGTTGCC CGCAGCCGCG CCACCCCGTT ATTAGATTTG ATCAAGACAG CGTTGACGCC
13561 CCATCCACCG CAAAAACAGG CGTATGGTGT GACATTACCC ACTTCAGTGC TGTTTATCGC
13621 CGGACACGAT ACTAATCTGG CAAATCTCGG CGGCGCACTG GAGCTCAACT GGACGCTTCC
13681 CGGTCAGCCG GATAACACGC CGCCAGGTGG TGAACTGGTG TTTGAACGCT GGCGTCGGCT
13741 AAGCGATAAC AGCCAGTGGA TTCAGGTTTC GCTGGTCTTC CAGACTTTAC AGCAGATGCG
13801 TGATAAAACG CCGCTGTCAT TAAATACGCC GCCCGGAGAG GTGAAACTGA CCCTGGCAGG
13861 ATGTGAAGAG CGAAATGCGC AGGGCATGTG TTCGTTGGCA GGTTTTACGC AAATCGTGAA
13921 TGAAGCACGC ATACCCGCTT GCAGTTTGTA AGGTACCCGG GGATCACAAC TTGCCCTCTG
13981 AAGAGGAAGA ACAGAAGGAT GCCACAACTC TCCTGCTGGC TACTCTCCAG TGGTTTCATC
14041 TTACTTCTGA TGGCATTTCC CTCTAGAAAG TGCTACTATC ATCCACACAT TTCTACCTGA
14101 GACCACCCAA AGGACCCTCC CAAATTCTCT TCCTCTCTGA GTAGTCTCCA CACCTGTTAC
14161 CACCATCCCA GAATTAAAAT CCTAACTGCA CTCTGGCGTG TGACTTGCCT CAGTCCTTGC
14221 AATAAGAGTT GTTGGCAGTG CCAGGCGTGG TGGCGCACGC CTTTAATTCC AGCACTTGGG
14281 AGGCAGAGGC AGGCGGATTT CTGAGTTCGA GGCCAGCCTG GTCTACAGAG TGAGTTCCAG
14341 GACAGCCAGG GCTATACAGA GAAACCCTGT GTCGAAAAAC CAAAAAAAAA AAAAAAAGTT
14401 GTTGGCAGAG TGTGGGTTAT ATACCAGGTG GAGATTTCAA ATGAGTGGCT GAAGCTGTAG
14461 CCAGAAGGAA CTTAGAGGAT AGCTCATAAC TTAAAAAGAA ATGTAGAGAG TAGCAGAAAC
14521 ATTGAGAGAG TGGGCACACA GCCACTGTGT GAATGTGGCA GAACACAATC CAGCCAGCTA
14581 TACATGCATA AGTGTATATT GGCGCCATCC TGACTGATGA GACACAGGAA AACAGATAGA
14641 CGGGGTTAGG TGGCCATGGC CTTTCCTGCC TGCCTCTTCC TAAGGGTCAT CTCAAGACCT
14701 TATGCTCTCT TAACTCTTCC ATTGCTACTT AGCTTCTAGA TATCACCTCC AGATTAGTCT
14761 CCTTGGGTAC ATCAGTGATC CTGGTGATAT CCAGGGCTTC CTGATTCCAT CTTTGTCATA
14821 GAGGCTGCAA CTAAAGAGGT CTTCTTAATA CTTCACACCC TGATGCCAAA AGGAAGACAC
14881 AGAAGTTCAC AGAGGTGAAG TGATTCATGT AGGACATACA GTGAGCAAGC ATCAGGGTCC
14941 GGATTATCTG ACTCTACTCT AACTTTTATG TAAATGTGCT TTATGCCATT AACACTGTCA
15001 TTCCTGTGCT TCAGCTCTGG GAGACTCCCA AGCACTCTTA GGCACAAGCC ACAATTAAGG
15061 GACTCTGACA CTCTGCATTG ATTAATTAGC ATGGTGGTCT CTATGTTTCC AGATTCATGA
15121 TTGTTTCACT TTCCATATAG GCTATGAAGG GTGTGAGGAA ATTTTTTGGG GACAGAATTG
15181 GAGGCAATCC ACCTCTCTCA GGAAGCCTCT ATCTGGAAAA GCTTACAACT CAGGGACAGT
15241 AACTGTAGGC CCAGTCCTTG GTGTCCAAAA TGGGTTTTAT GGTTTGAATC TGCAAAGCCT
15301 TCCATGTGCT CAAAGGTTTG AACATGGAGC CTCCTCCTGG TAACACTGTA TTGGAGGCTT
15361 TTGAGACTGG ATGCTCTTTG GTCCCATGTT TTGCTACATC ATCTGTCAAG ATATGACCCA
15421 GGCATGCTAC CAGCTACCAC AGACTATGCC TCTCCAGCTT TCATGTTCTC CCCACCATGA
15481 TAGACTTGTA TCTCCTAAAA ATGGAATCAA AGCAAACTTT TCCTGCATTA AGTTTTTTTT
15541 TTTCTGTTAA GTGTTTGGTC ACAGGGACAA GAAAACACTC AATACAGATA ATTAGTACCA
15601 GAGTTGAGGT TCATTGCTCT AGCAAGTTGG ATCAAATTTT TAGGGCTTTG GAACTGATTT
15661 ATAAGAGACA TGTAGAAGAG TCTGAAGCTG TGGGCTACAG AAGTGTCACC AGTTTTTAAG
15721 AATAGTTTAA TACACCATGG GAATTGTGAA AATCAGAATG CTCACACAAA GGCAGACAGG
15781 AAAACGTGAG CATGTGGCGT GTGAGAGGGC ATAAGAAGGA ACCTAGGGGG AAATGAGCTA
15841 GAAGCCATTC GGCTACGTTA GGGAACGTGT GTGGCTGTGC TTGGCCCATG CCCTGGCAAT
15901 CTGAATGAGG CCAAATTTTA AAGGAGTGGA CTAACTCGAT TGTCAGAGAA AATATCAAGA
15961 CAGACCACCA CTCAGGCTAT GCCGTGTTTG TGACCGACCA GCTACTCTTA GCCAGCTCTA
16021 TTGTGAAATT CCAGAGCAAT TATCAGAGCA TGAAGATACA TACAGTTTAG TGAAGTAAGG
16081 GGTGTGGGTC CCTAAGTGGA TGGTGCATAA ATCTATGTAG GTGATGCCTA AGTGACACTT
16141 GATAATCCAA AATATCAGCA ATGTGGAATG TCTTCCAAGG AGACCTGTAG ACACACATTT
16201 TAGAACTTTG CTCATGGCTG TAATAAATAG CTAGCTAGAA ATCATTTCCT GAAGAGGTTA
16261 GTCTGAGTTA CGGTTCCAGG GCAAACATTC AGTGATGGCA AGGAAGGCAT TGCAGTCAGG
16321 AGCCAAAGGT CAGCTGGTCA CATTGCATCA AGAGTAGAGA GTCAGAGTGT GAGTAGAAAG
16381 AGGATACAGG TTATAAAACC TCACTGTCCA CTCTCAGCAA TCCATTTTCT CCTAAAAGGC
16441 TTTACCTTCT AAAGATTTTA GTCTTCAAAA CCAGTACCAG TAGCCTGGGA ACAAAAGTTG
16501 AAACAAATGA GCCTTTGTGG GGCATTTCAC ACTTAAAACA GGGCATCACC TAGGAGGAGC
16561 CCTGTGTGCA GTAGGAAGTG TGGCCTCTGT GTCAGGAATG CTCAGGCTAA TAAGGGGTCC
16621 TCTATCTGAG GGACCCTATG AAGATTCAAC AAGTAGTTGT GAGAATTCCC TGTAAATGGA
16681 TGCTACCAAT TTGACATTTG TAGACCTGCT ATTGTGTGCT TCTTTATTGG GCTCTCCCAT
16741 CTCCCAACTT TCCAACCCAT ATTCCACATT AATCCCTTCC ACCACCATGC AACACTAGGT
16801 AGGAGAGAAG GAAGGTTAGA AGAGAAAGTG GGTATAGATC TATTTAGACT ACTTCCTGCT
16861 GATTAGGGGC AAGTCCAATC GTCATTGTCA GGATACCTCC AACCAGCAAC CAGCAAACCA
16921 GCAAATCAGA AACAGCAAAA GCAGCCAACA AGGCAGCACT AACCAGCAGG ATTGGGGTCG
16981 GTAGCGTGGG AGCAGTCACT ACTGGTCTTC TCATGGCTTT GGCATTAATA CTCTCTCAAG
17041 AAATTCCGTA ATTTTTTCCC CACCACCTGA AATTCCGTAA TTTTAAATGC AAACTATCTA
17101 CAGCTGGCAA AAATCACATC TCTCCTAGAG CACAAGACAA ATCATAGTTA CTGGCTATTT
17161 GCAATCTGAA GCATCTCAAT ATCCCACACC TGGGATTAAA ACAAAAACAT ATTCACATCA
17221 CATAACTGTT TTTTTTTTCC AATTTTTTAT TAGGTATTTT CTTTATTTAC ATTTCAAATG
17281 CTATCCCGAA AGTCCCCTAT ACCCTCCCAC CTCCCTGCTC CCCTACACAC CCACTCCCAC

17341 TTTTTGACCC TGGAGTTCCC CGGTACTGGG GCATATAAAG TTTGCAAGAC CAAGGGGCCT 

174 01 CTCTTCCCAG TGATGGCCGA CTAAGCCATC TTCTGCTACA TATGCAGATA GAGACACGAG 
17461 CTCTGGGGGT ACTAGTTAGT TCATATTGTT GTTCCACCTA TAGGGT CGCA GACCCCTTCA 
17521 GCTCCTTGGG TACTTTGTCT AGCTCCTCCA CTGGGGGCTC TGTGTTTTAT CTAATAGATG 

175 61 ACTGTGAGCA TCCACTTCTG TATTTGACAG GCACTGGCCT AGCGTCACAT GAGCCAGCTA 
17641 TATCAGGGTC CTTTCAGCAA AACCTTGCTG GCATGTGCAA TAGTG TCTG C GTTTGGTGGT 
17701 TGATTATGGG ATGGATCCAC TAGTTCTAGA GCGGCCGCCA CCGCGGTGGA GCTCCAGCTT 
177 61 TTGTTCCCTT TAGTGAGGGT TAATTGCGCG CTTGGCGTAA TCATGGTCAT AGCTGTTTCC 
17821 TGTGTGAAAT TGTTATCCGC TCACAATTCC ACACAACATA CGAGCCGGAA GCATAAAGTG 
17881 TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGCGTTGC GCTCACTGCC 
17941 CGCTTTCCAG TCGGGAAACC TGTCGTGCCA GCTGCATTAA TGAATCGGCC AACGCGCGGG 
18O01 GAGAGGCGGT TTGCGTATTG GGCGCTCTTC CGCTTCCTCG CTCACTGACT CGCTGCGCTC 
18061 GGTCGTTCGG CTGCGGCGAG CGGTATCAGC TCACTCAAAG GCGGTAATAC GGTTATCCAC 
18121 AGAATCAGGG GATAACGCAG GAAAGAACAT GTGAGCAAAA GGCCAGCAAA AGGCCAGGAA 
18181 CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT CCATAGGCTC CGCCCCCCTG ACGAG CATCA 
18241 CAAAAATCGA CGCTCAAGTC AGAGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC 
16301 GTTTCCCCCT GGAAGCTCCC TCGTGCGCTC TCCTGTTCCG ACCCTGCCGC TTACCGGATA 
18361 CCTGTCCGCC TTTCTCCCTT CGGGAAGCGT GGCGCTTTCT CATAGCTCAC GCTGTAGGTA 
18421 TCTCAGTTCG GTGTAGGTCG TTCGCTCCAA GCTGGGCTGT GTGCACGAAC CCCCCGTTCA 
184B1 GCCCGACCGC TGCGCCTTAT CCGGTAACTA TCGTCTTGAG TCCAACCCGG TAAGACACGA 
IB 541 CTTATCGCCA CTGGCAGCAG CCACTGGTAA CAGGATT AG C AGAG CGAGGT ATGTAGGCGG 
18601 TGCTACAGAG TTCTTGAAGT GGT GGCCTAA CTACGGCTAC ACTAGAAGGA CAGTATTTGG 
1B6S1 TATCTGCGCT CTGCTGAAGC CAGTTACCTT CGGAAAAAGA GTTGGTAGCT CTTGATCCGG 
18721 CAAACAAACC ACCG CTGGTA GCGGTGGTTT TTTTGTTTGC AAGCAGCAGA TTACGCGCAG 
18781 AAAAAAAGGA TCTCAAGAAG ATCCTTTGAT CTTTTCTACG GGGTCTGACG CTCAGTGGAA 
18841 CGAAAACTCA CGTTAAGGGA TTTTGGTCAT GAGATTATCA AAAAGGATCT TCAC CTAGAT 
18901 CCTTTTAAAT TAAAAATGAA GTTTTAAATC AATCTAAAGT ATATATGAGT AAACTTGGTC 
18961 TGACAGTTAC CAATG CTTAA TCAGTGAGGC ACCTATCTCA GCGATCTGTC TATTTCGTTC 
19021 ATCCATAGTT GCCTGACTCC CCGTCGTGTA GATAACTACG ATACGGGAGG GCTTACCATC 
19081 TGGCCCCAGT GCTGCAATGA TACCGCGAGA CCCACGCTCA CCGGCTCCAG ATTTATCAGC 
19141 AATAAACCAG CCAGCCGGAA GGGCCGAGCG CAGAAGTGGT CCTGCAACTT TATCCGCCTC 
19201 CATCCAGTCT ATTAATTGTT GCCGGGAAGC TAGAGTAAGT AGTTCGCCAG TTAATAGTTT 
19261 GCGCAACGTT GTTGCCATTG CTACAGGCAT CGTGGTGTCA CGCTCGTCGT TTGGTATGGC 
19321 TTCATTCAGC TCCGGTTCCC AACGATCAAG GCGAGTTACA TGATCCCCCA TGTTGTGCAA 
193 81 AAAAGCGGTT AGCTCCTTCG GTCCTCCGAT CGTTGTCAGA AGTAAGTTGG CCGCAGTGTT 
19 441 ATCACTCATG G TT AT GG C AG CACTGCATAA TTCTCTTACT GTCATGC CAT CCGTAAGATG 
19 501 CTTTTCTGTG ACTGGTGAGT ACTCAACCAA GTCATTCTGA GAATAGTGTA TGCGGCGACC 
19561 GAGTTGCTCT TGCCCGGCGT CAATACGGGA TAATACCGCG CCACATAGCA GAACTTTAAA 
19621 AGTGCTCATC ATTGGAAAAC GTTCTTCGGG GCGAAAACTC TCAAGGATCT TACCGCTGTT 
196 81 GAGATCCAGT TCGATGTAAC CCACTCGTGC ACCCAACTGA TCTTCAGCAT CTTTTACTTT 
19741 CACCAGCGTT TCTGGGTGAG CAAAAACAGG AAGGCAAAAT GCCGCAAAAA AGGGAATAAG 
198 01 GGCGACACGG AAATGTTGAA TACT CAT ACT CTT C C TTTTT CAATATTATT GAAGCATTTA 
19861 TCAGGGTTAT TGTCTCATGA GCGGATACAT ATTTGAATGT ATTTAGAAAA ATAAACAAAT 
19921 AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTAAA TTGTAAGCGT TAATATTTTG 
19981 TTAAAATTCG CGTTAAATTT TTGTTAAATC AGCTCATTTT TTAAC CAAT A GGCCGAAATC 
20041 GGCAAAATCC CTTATAAATC AAAAGAATAG ACCGAGATAG GGTTGAGTGT TGTTCCAGTT 
20101 TGGAACAAGA GTCCACTATT AAAGAACGTG GACTCCAACG TCAAAGGGCG AAAAAC CGTC 
20161 TATCAGGGCG ATGGCC CACT ACGTGAACCA TCACCCTAAT CAAGTTTTTT GGGGTCGAGG 
20221 TGCCGTAAAG CACTAAATCG GAACCCTAAA GGGAGCCCCC GATTTAGAGC TTGACGGGGA 
20281 AAGCCGGCGA ACGTGGCGAG AAAGGAAGGG AAGAAAGCGA AAGGAGCGGG CGCTAGGGCG 
203 41 CTGGCAAGTG TAGCGGTCAC GCTGCGCGTA ACCACCACAC CCGCCGCGCH TAATGCGCCG 
20401 CTACAGGGCG CGTCCCATTC GCCATTCAGG CTGCGCAACT GTTGGGAAGG GCGATCGGTG 
20461 CGGGCCTCTT CGCTATTACG CCAGCTGGCG AAAGGGGGAT GTGCTGCAAG GCGATTAAGT 
20521 TGGGTAACGC CAGGGTTTTC CCAGTCACGA CGTTGTAAAA CGACGGCCAG TGAGCGCGCG
20581 TAATACGACT CACTATAGGG CGAATTGGGT ACCGGGCCCC CCC
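
Each sequence row above pairs a 1-based coordinate with blocks of up to ten bases, so the rows can be checked mechanically against the 20623 bp length stated in the LOCUS line. The following is a minimal sketch of such a check, not part of the original listing; the helper names `parse_origin_rows` and `check_contiguous` are hypothetical, and only the final two rows are used as input.

```python
import re


def parse_origin_rows(rows):
    """Return a list of (start_coordinate, bases) tuples.

    Each row is expected to look like '20581 TAATACGACT CACTATAGGG ...':
    a 1-based coordinate followed by blocks of up to ten bases.
    """
    parsed = []
    for row in rows:
        fields = row.split()
        if not fields:
            continue  # skip blank lines
        start = int(fields[0])
        bases = "".join(fields[1:]).upper()
        if re.fullmatch(r"[ACGT]*", bases) is None:
            raise ValueError(f"unexpected character in row starting at {start}")
        parsed.append((start, bases))
    return parsed


def check_contiguous(parsed):
    """True if every row starts exactly where the previous row ended."""
    return all(
        next_start == start + len(bases)
        for (start, bases), (next_start, _) in zip(parsed, parsed[1:])
    )


# The final two rows of the Lama2/APPA plasmid sequence.
tail = [
    "20521 TGGGTAACGC CAGGGTTTTC CCAGTCACGA CGTTGTAAAA CGACGGCCAG TGAGCGCGCG",
    "20581 TAATACGACT CACTATAGGG CGAATTGGGT ACCGGGCCCC CCC",
]

rows = parse_origin_rows(tail)
last_start, last_bases = rows[-1]
total_length = last_start + len(last_bases) - 1  # coordinate of the last base
print(check_contiguous(rows), total_length)
```

Under these assumptions the coordinate of the last base should agree with the sequence length given in the LOCUS line; a mismatch would point to a dropped or duplicated base in the transcription.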
Figure 18: Nucleic acid sequence of the known segment of the R15/appa+intron plasmid,
including the vector sequences of pBLCAT3 (SEQ ID NO:2).



LOCUS       R15/appa+intron 6708 bp DNA SYN 15-APR-2000
DEFINITION  R15/appa+intron transgene with vector cut 13543 to 4954
ACCESSION   R15/appa+intron
REFERENCE   1 (bases 1 to 6708)
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
KEYWORDS    salivary proline-rich protein, acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.

DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"

FEATURES             Location/Qualifiers
DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene,
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision;
            Enterobacteriaceae; Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.

WO 00/64247 PCT/CA00/00430




  TITLE     The complete nucleotide sequence of the Escherichia coli
            gene appA reveals significant homology between pH 2.5
            acid phosphatase and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
                     /translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
                     TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
                     GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
                     NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
                     ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
                     YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
                     ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
                     PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
     mutation        replace(1817..1819,"gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094,"ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097,"gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"



DEFINITION  Plasmid pBLCAT3 (bases 3109 to 6708)
ACCESSION   X64409
VERSION     X64409.1 GI:58163
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 3109 to 6708)
  AUTHORS   Luckow,B.H.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res
            Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG
REFERENCE   2 (bases 3109 to 6708)
  AUTHORS   Luckow,B. and Schutz,G.
  TITLE     CAT constructions with multiple unique restriction sites for
            the functional analysis of eukaryotic promoters and regulatory
            elements
  JOURNAL   Nucleic Acids Res. 15 (13), 5490 (1987)
  MEDLINE   87260024
COMMENT     Promoterless CAT vector for transient transfection experiments
            with eukaryotic cells. Allows the analysis of foreign
            promoters and enhancers.
FEATURES             Location/Qualifiers
     source          3109 to 6116
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     SV40 t intron   3197..3810
                     /note="SV40 signals"
     polyA_signal    3807..4047
                     /note="SV40 signals"
     CDS             complement(5244..6104)
                     /codon_start=1
                     /transl_table=11
                     /gene="Amp"
                     /product="beta-lactamase"
                     /protein_id="CAA45753.1"
                     /db_xref="GI:58165"

BASE COUNT 1916 a 1479 c 1515 g 1798 t 

ORIGIN 

1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GGAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT

901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG 
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG 
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT
1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTCAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT
3301 GTGTTAAACT ACTGATTCTA ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG
3361 AATGGGAGCA GTGGTGGAAT GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC
3421 CATCTAGTGA TGATGAGGCT ACTGCTGACT CTCAACATTC TACTCCTCCA AAAAAGAAGA
3481 GAAAGGTAGA AGACCCCAAG GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG
3541 TGTTTAGTAA TAGAACTCTT GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC
3601 TGCTATACAA GAAAATTATG GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT
3661 ATAATCATAA CATACTGTTT TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA
3721 ACTATGCTCA AAAATTGTGT ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT
3781 ATTTGATGTA TAGTGCCTTG ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT
3841 TTACTTGCTT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA
3901 ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC
3961 ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC
4021 ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC CGAGCTCGAA TTCGTAATCA
4081 TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA CAATTCCACA CAACATACGA
4141 GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG TGAGCTAACT CACATTAATT
4201 GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA
4261 ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC TTCCTCGCTC
4321 ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG

4381 GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC
4441 CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC
4501 CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA
4561 CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC
4621 CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA
4681 TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG
4741 CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC
4801 AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA
4861 GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT
4921 AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT
4981 GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG
5041 CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG
5101 TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA
5161 AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA
5221 TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG
5281 ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT AACTACGATA
5341 CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG
5401 GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT
5461 GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT
5521 TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT GGTGTCACGC
5581 TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG AGTTACATGA
5641 TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT
5701 AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC
5761 ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA
5821 TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA
5881 CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA
5941 AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT
6001 TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC
6061 GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA
6121 TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT
6181 TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC
6241 TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT
6301 CGTCTCGCGC GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT CCCGGAGACG
6361 GTCACAGCTT GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG
6421 GGTGTTGGCG GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT TGTACTGAGA
6481 GTGCACCATA TGCGGTGTGA AATACCGCAC AGATGCGTAA GGAGAAAATA CCGCATCAGG
6541 CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG GGCCTCTTCG
6601 CTATTACGCC AGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG GGTAACGCCA
6661 GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG CCAAGCTT
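
The feature table of Figure 18 annotates three codon replacements in appA (at 1817, 3092 and 3095) as silent mutations named A3, P428 and A429. The sketch below, which is an illustration and not part of the original listing, checks both claims: that each replacement codon encodes the same amino acid, and that the residue number follows from the CDS start at 1811. The names `CODON_TABLE`, `MUTATIONS` and `describe` are hypothetical, and the standard genetic code is assumed (translation table 11 assigns the same amino acids to sense codons).

```python
from itertools import product

# Standard genetic code with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

CDS_START = 1811  # first base of the appA CDS in the listing

# (first base of codon, original codon, replacement codon, annotated name)
MUTATIONS = [
    (1817, "gcg", "gcc", "A3"),
    (3092, "ccg", "ccc", "P428"),
    (3095, "gcg", "gct", "A429"),
]


def describe(position, old_codon, new_codon):
    """Return (residue label, is_silent) for one codon replacement."""
    residue_number = (position - CDS_START) // 3 + 1
    old_aa = CODON_TABLE[old_codon]
    new_aa = CODON_TABLE[new_codon]
    return f"{old_aa}{residue_number}", old_aa == new_aa


results = [describe(pos, old, new) for pos, old, new, _ in MUTATIONS]
for (label, silent), (_, _, _, name) in zip(results, MUTATIONS):
    print(name, label, "silent" if silent else "NOT silent")
```

Each derived label should match the annotated `/standard_name`, and each replacement should be reported as silent.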
Figure 19: Nucleic acid sequence of the known segment of the R15/appa+intron transgene
used for the generation of transgenic mice (SEQ ID NO:3).



LOCUS       R15/appa 4060 bp DNA SYN 15-APR-2000
DEFINITION  R15/appa transgene without vector
ACCESSION   R15/appa
REFERENCE   1 (bases 1 to 4060)
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
KEYWORDS    salivary proline-rich protein, acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.

DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"

FEATURES             Location/Qualifiers
DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene,
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision;
            Enterobacteriaceae; Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.



  TITLE     The complete nucleotide sequence of the Escherichia coli
            gene appA reveals significant homology between pH 2.5
            acid phosphatase and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)
  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
                     /translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
                     TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
                     GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
                     NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
                     ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
                     YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
                     ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
                     PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride
                     phosphohydrolase"
     mutation        replace(1817..1819,"gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094,"ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097,"gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"

     SV40 t intron   3197..3810
                     /note="SV40 signals"
     polyA_signal    3807..4047
                     /note="SV40 signals"



BASE COUNT 1257 a 814 c 843 g 1146 t

ORIGIN
1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT
1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTCAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA

2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT
3301 GTGTTAAACT ACTGATTCTA ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG
3361 AATGGGAGCA GTGGTGGAAT GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC
3421 CATCTAGTGA TGATGAGGCT ACTGCTGACT CTCAACATTC TACTCCTCCA AAAAAGAAGA
3481 GAAAGGTAGA AGACCCCAAG GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG
3541 TGTTTAGTAA TAGAACTCTT GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC
3601 TGCTATACAA GAAAATTATG GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT
3661 ATAATCATAA CATACTGTTT TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA
3721 ACTATGCTCA AAAATTGTGT ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT
3781 ATTTGATGTA TAGTGCCTTG ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT
3841 TTACTTGCTT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA
3901 ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC
3961 ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC
4021 ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC
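
The annotation for Figure 19 places a consensus initiation sequence at 1802-1810 and the start of the appA CDS at 1811, both of which fall in the sequence row beginning at coordinate 1801. The sketch below is an illustration only, not part of the original listing: it slices that row by listing coordinate and translates the complete codons it contains, so the result can be compared with the start of the `/translation` qualifier. The names `slice_by_coordinate` and `translate` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

ROW_START = 1801  # coordinate of the first base in the row below
ROW = "AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC"
row_bases = ROW.replace(" ", "")


def slice_by_coordinate(first, last):
    """Return bases first..last (1-based, inclusive listing coordinates)."""
    return row_bases[first - ROW_START : last - ROW_START + 1]


def translate(dna):
    """Translate complete codons only; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i : i + 3]] for i in range(0, usable, 3))


initiation_consensus = slice_by_coordinate(1802, 1810)
cds_fragment = slice_by_coordinate(1811, ROW_START + len(row_bases) - 1)
peptide = translate(cds_fragment)
print(initiation_consensus, peptide)
```

The translated fragment should reproduce the first residues of the annotated appA translation, and the third codon should be the gcc introduced by the A3 replacement.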






Figure 20: Nucleic acid sequence of the known segment of the R15/appa plasmid (including
the vector sequences of pBLCAT3) (SEQ ID NO:4).



LOCUS       R15/appa 6116 bp DNA SYN 15-APR-2000
DEFINITION  R15/appa transgene with vector
ACCESSION   R15/appa
REFERENCE   1 (bases 1 to 6116)
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
KEYWORDS    salivary proline-rich protein, acid glucose-1-phosphatase; appA
            gene; periplasmic phosphoanhydride phosphohydrolase; artificial
            sequence;
  AUTHORS   Golovan, S., Forsberg, C.W., Phillips, J.
  JOURNAL   Unpublished.

DEFINITION  Rat salivary proline-rich protein (RP15) gene.
ACCESSION   M64793 M36414
VERSION     M64793.1 GI:206711
SOURCE      Rat (Sprague-Dawley) liver DNA.
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
            Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
REFERENCE   1 (bases 1 to 1748)
  AUTHORS   Lin,H.H. and Ann,D.K.
  TITLE     Molecular characterization of rat multigene family encoding
            proline-rich proteins
  JOURNAL   Genomics 10, 102-113 (1991)
  MEDLINE   91257817
FEATURES             Location/Qualifiers
     source          1..1748
                     /organism="Rattus norvegicus"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /tissue_type="liver"
                     /tissue_lib="cosmid genomic library"
     misc_feature    1802-1810
                     /function="consensus sequence for initiation in
                     higher eukaryotes"

FEATURES             Location/Qualifiers
DEFINITION  E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
            gene,
ACCESSION   M58708 L03370 L03371 L03372 L03373 L03374 L03375
VERSION     M58708.1 GI:145283
SOURCE      Escherichia coli DNA.
  ORGANISM  Escherichia coli
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
REFERENCE   1 (bases 1811..3109)
  AUTHORS   Dassa,J., Marck,C. and Boquet,P.L.
  TITLE     The complete nucleotide sequence of the Escherichia coli gene appA
            reveals significant homology between pH 2.5 acid phosphatase
            and glucose-1-phosphatase
  JOURNAL   J. Bacteriol. 172 (9), 5497-5500 (1990)

  MEDLINE   90368616
FEATURES             Location/Qualifiers
     source          1811..3109
                     /organism="Escherichia coli"
                     /db_xref="taxon:562"
     sig_peptide     1811..1876
                     /gene="appA"
     CDS             1811..3109
                     /gene="appA"
                     /standard_name="acid phosphatase/phytase"
                     /transl_table=11
                     /product="periplasmic phosphoanhydride phosphohydrolase"
                     /protein_id="AAA72086.1"
                     /db_xref="GI:145285"
                     /translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
                     TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
                     GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
                     NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
                     ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
                     YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
                     ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
                     PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"
     mat_peptide     1877..3106
                     /gene="appA"
                     /product="periplasmic phosphoanhydride phosphohydrolase"
     mutation        replace(1817..1819,"gcg changed to gcc")
                     /gene="appA"
                     /standard_name="A3 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3092..3094,"ccg changed to ccc")
                     /gene="appA"
                     /standard_name="P428 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"
     mutation        replace(3095..3097,"gcg changed to gct")
                     /gene="appA"
                     /standard_name="A429 mutant"
                     /note="created by site directed mutagenesis"
                     /phenotype="silent mutation"



DEFINITION  Plasmid pBLCAT3 (bases 3109 to 6116)
ACCESSION   X64409
VERSION     X64409.1 GI:58163
SOURCE      synthetic construct.
  ORGANISM  synthetic construct
            artificial sequence.
REFERENCE   1 (bases 3109 to 6116)
  AUTHORS   Luckow,B.H.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res
            Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG



REFERENCE   2 (bases 3109 to 6116)
  AUTHORS   Luckow,B. and Schutz,G.
  TITLE     CAT constructions with multiple unique restriction sites for
            the functional analysis of eukaryotic promoters and regulatory
            elements
  JOURNAL   Nucleic Acids Res. 15 (13), 5490 (1987)
  MEDLINE   87260024
COMMENT     Promoterless CAT vector for transient transfection experiments
            with eukaryotic cells. Allows the analysis of foreign
            promoters and enhancers.
FEATURES             Location/Qualifiers
     source          3109 to 6116
                     /organism="synthetic construct"
                     /db_xref="taxon:32630"
     polyA_signal    3262..3457
                     /note="SV40 signals"
     CDS             complement(4654..5514)
                     /codon_start=1
                     /transl_table=11
                     /gene="Amp"
                     /product="beta-lactamase"
                     /protein_id="CAA45753.1"
                     /db_xref="GI:58165"

BASE COUNT 1724 a 1386 c 1407 g 1599 t

ORIGIN

1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA
61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA
121 CTCTTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT
181 ATAGGTCTAA TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG
241 CATGTAGTAT CCATAGTCCA TCAATGAGAG AGACATTTAA CATGATTTTC ATTAATCAGG
301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA
1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT

1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT 
1741 CC AG C AC AG A TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA 
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC 
1861 GCAATCTGCA TTCG CTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG 
1921 TCGT CATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA 
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT 
2 041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA 
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA 
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA 
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT 
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTCAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG 
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA 
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA 
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA
3301 AATGAATGCA ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG
3361 CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT
3421 GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC CGAGCTCGAA
3481 TTCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA CAATTCCACA
3541 CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG TGAGCTAACT
3601 CACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT
3661 GCATTAATGA ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCCGC
3721 TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA
3781 CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG
3841 AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA
3901 TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA
3961 CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC
4021 TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC 
4081 GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT 
4141 GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG 

4201 TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG
4261 GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA 
4321 CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG 
4381 AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT
4441 TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT
4501 TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG
4561 ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT
4621 CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC
4681 TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG TCGTGTAGAT
4741 AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC
4801 ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG
4861 AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG
4921 AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA CAGGCATCGT
4981 GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC GATCAAGGCG
5041 AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC CTCCGATCGT
5101 TGTCAGAAGT AAGTTGGCCG CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC
Figure 20 (continued):

5161 TCTTACTGTC ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC 
5221 ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 
5281 TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG
5341 AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC
5401 CAACTGATCT TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG
5461 GCAAAATGCC GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT
5521 CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT
5581 TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC
5641 ACCTGACGTC TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC
5701 GAGGCCCTTT CGTCTCGCGC GTTTCGGTGA TGACGGTGAA AACCTCTGAC ACATGCAGCT
5761 CCCGGAGACG GTCACAGCTT GTCTGTAAGC GGATGCCGGG AGCAGACAAG CCCGTCAGGG
5821 CGCGTCAGCG GGTGTTGGCG GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT
5881 TGTACTGAGA GTGCACCATA TGCGGTGTGA AATACCGCAC AGATGCGTAA GGAGAAAATA
5941 CCGCATCAGG CGCCATTCGC CATTCAGGCT GCGCAACTGT TGGGAAGGGC GATCGGTGCG
6001 GGCCTCTTCG CTATTACGCC AGCTGGCGAA AGGGGGATGT GCTGCAAGGC GATTAAGTTG
6061 GGTAACGCCA GGGTTTTCCC AGTCACGACG TTGTAAAACG ACGGCCAGTG CCAAGC
Figure 21: Nucleic acid sequence of the known segment of the R15/appa transgene used for
the generation of transgenic mice (SEQ ID NO:5).



LOCUS R15/appa 3470 bp DNA SYN 15-APR-2000

DEFINITION R15/appa transgene with vector sequences removed.
ACCESSION R15/appa
REFERENCE 1 (bases 1 to 3470)
SOURCE synthetic construct.

ORGANISM synthetic construct
artificial sequence.
KEYWORDS salivary proline-rich protein, acid glucose-1-phosphatase; appA
gene; periplasmic phosphoanhydride phosphohydrolase; artificial
sequence;

AUTHORS Golovan, S., Forsberg, C.W., Phillips, J. 

JOURNAL Unpublished. 

DEFINITION Rat salivary proline-rich protein (RP15) gene. 
ACCESSION M64793 M36414 
VERSION M64793.1 GI:206711 

SOURCE Rat (Sprague-Dawley) liver DNA. 

ORGANISM Rattus norvegicus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

REFERENCE 1 (bases 1 to 1748)
AUTHORS Lin,H.H. and Ann,D.K.
TITLE Molecular characterization of rat multigene family encoding
proline-rich proteins
JOURNAL Genomics 10, 102-113 (1991)
MEDLINE 91257817
FEATURES Location/Qualifiers
source 1..1748
/organism="Rattus norvegicus"
/strain="Sprague-Dawley"
/db_xref="taxon:10116"
/tissue_type="liver"
/tissue_lib="cosmid genomic library"
misc_feature 1802-1810
/function="consensus sequence for initiation in
higher eukaryotes"



FEATURES Location/Qualifiers 

DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
gene,

ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375

VERSION M58708.1 GI:145283 

SOURCE Escherichia coli DNA. 

ORGANISM Escherichia coli 

Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;

Escherichia.

REFERENCE 1 (bases 1811-3109)

AUTHORS Dassa,J., Marck,C. and Boquet,P.L.

TITLE The complete nucleotide sequence of the Escherichia coli gene appA

reveals significant homology between pH 2.5 acid phosphatase
and glucose-1-phosphatase
Figure 21 (continued):

JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990)
MEDLINE 90368616 

FEATURES Location/Qualifiers

Source 1811..3109
/organism="Escherichia coli"
/db_xref="taxon:562"

sig_peptide 1811..1876
/gene="appA"
CDS 1811..3109
/gene="appA"
/standard_name="acid phosphatase/phytase"
/transl_table=11
/product="periplasmic phosphoanhydride phosphohydrolase"
/protein_id="AAA72086.1"
/db_xref="GI:145285"
/translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"

mat_peptide 1877..3106
/gene="appA"
/product="periplasmic phosphoanhydride phosphohydrolase"

mutation replace (1817..1819, "gcg changed to gcc")
/gene="appA"
/standard_name="A3 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"
mutation replace (3092..3094, "ccg changed to ccc")
/gene="appA"
/standard_name="P428 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"
mutation replace (3095..3097, "gcg changed to gct")
/gene="appA"
/standard_name="A429 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"

polyA_signal 3262..3457
/note="SV40 signals"

BASE COUNT 1065 a 721 c 735 g 949 t 

ORIGIN 

1 GGATCCCCTT TGCTATGTAG TTTTTAATGG AAATTACAAC CCATAGTGTG TTGATAAATA 

61 GAGAGTCCTG TTTGGTTTAA GCAACCTCTG TTTCTCATAA ACTCCATAAA AACAGGAATA 

121 CTC-PTTGTTT CTAGCATAAC CAAAAGATTT AGTGAATTGA AAACAATGTT CCCTTAGAGT

181 ATAGGTCTL^ TAACCCCGAA AATATTACCA TGATACTGAG CATTTGTAAG TATCTCATAG 

241 CATGTAGTAT CCATAGTCCA TCAATGAGAG A?ACATTTAA CATGATTTTC ATTAATCAGG
Figure 21 (continued):

301 TGGAAAAGAC ATGACAACAT TCACAGGCAC TGCACAGAAC ATAGTGGTCC ACCTTGCACA
361 TATTTCACTA AACTAGGTTT ATCTATTTTG TTGCTTTCTC TAACATCTCT GCAATGAAGC 
421 AGGTCAACAG TGCCACATAT CCTTTACTTA ACCTAAGGAA CACAAAAAAT TTTCTACATA 
481 TATCCTGGTT AGAGAGTGCT TAAAATAAGT TTTCCAAGAA TGGAAAAGAA ATGTTCTGAC 
541 TTAACAATTA AGACAGTATT TATTTAAAGC AAGAAATATG AGGCACACAA GAAAATATTT 
601 TGGGAAGAAA CCATTTGGTG AACAATATTT CAAATAAAAA TAGACAAACA TAGTTAATTG 
661 TAAAACATAT GTTTGACCAG CCCTTCTTTT CAATAGGCTT AATGTGAATA AAATGTTAAA
721 GATTCTCTTT GGGTGGCTGC AAATTGTCCA CGAATAAGAC AAAATATAAA AATAAGGACT 
781 GAGTCTCACA AAATGAAAAG GAAATATATT CAGAAAGAGA ATCTTGAGAG AATGTGTTGT 
841 CACAAATTAA AGAAAACCTG TGGTGAATGA CATCCTGAGG CCTGAGCTAT TACTGACATT 
901 TAAGATAAAG GTAACTGTAT ACATTTGTCC CATTGAGGGG ACAAGAAAGC TGCTCTCATG 
961 TTCAGCTCTA TAATTCTTGC CTTAAACAAC TTAAATAGAA TGATTTAAAA TATGGAGCTG 
1021 TCCATGGACC TTTGAAATAT AAAATAGTCA AGCAACTTAT CAAGGAATTA CAGATTCCTT
1081 GATACTAACA CAGGTAAATC CCACACGTGT TTTGAGACTA CATTTGCTGG GATTTTATTG
1141 ATGTAATAGG TCACATGTTT TTCGGGCCAA TGTTGCTGTT ATTCGGTTAC TTCAAGAGAA 

1201 TAGTGGCAAC TGATGCTATG TATTCTAGGG GTTTGAAGTG ATGTTTCATG ATTGAAATTT
1261 GTAAAAGAAT AACATCATCA TTCTTAACAA TAGAACATAT AAAGTCACAC AGAAGTGACA
1321 GTGTTTAAGC TGTACTATTG ATCAAAGAAA TTTATTACCT TCAGTTTCAA TGGAAATAAT
1381 TACTGATAAT ACAAACATGT GTGAACACAC ACTAATCCTA TCCAAATGCA CAGTGATACA
1441 CAGAAAATAT TAGCAAGTAG AATGCAATAT TTATATAACG ATTGTATTTA TCAATCAATT
1501 GTATGTATCA ATATATGGGC TATTTTCTTA CACATGATTT TATTCAAATT TACTCTAATC
1561 ATTGTTGAAC CATTTAGAAA AGGCATACTG GCAACTTTTC CTTACCTCAT CCAGCTGGGC
1621 AAAAGTCCCA GTGTGGAGTA AAGGATGCAA GATTTCCTGC TCTGTTAAGT ATAAAATAAT 

1681 AGTATGAATT CAAAGGTGCC ATTCTTCTGC TTCTAGTTAT AAAGGCAGTG CTTGCTTCTT
1741 CCAGCACAGA TCTGGATCTC GAGGAGCTTG GCGAGATTTT CAGGAGCTAA GGAAGCTAAA
1801 AGCCGCCACC ATGAAAGCCA TCTTAATCCC ATTTTTATCT CTTCTGATTC CGTTAACCCC
1861 GCAATCTGCA TTCGCTCAGA GTGAGCCGGA GCTGAAGCTG GAAAGTGTGG TGATTGTCAG
1921 TCGTCATGGT GTGCGTGCTC CAACCAAGGC CACGCAACTG ATGCAGGATG TCACCCCAGA
1981 CGCATGGCCA ACCTGGCCGG TAAAACTGGG TTGGCTGACA CCGCGCGGTG GTGAGCTAAT
2041 CGCCTATCTC GGACATTACC AACGCCAGCG TCTGGTAGCC GACGGATTGC TGGCGAAAAA
2101 GGGCTGCCCG CAGTCTGGTC AGGTCGCGAT TATTGCTGAT GTCGACGAGC GTACCCGTAA
2161 AACAGGCGAA GCCTTCGCCG CCGGGCTGGC ACCTGACTGT GCAATAACCG TACATACCCA
2221 GGCAGATACG TCCAGTCCCG ATCCGTTATT TAATCCTCTA AAAACTGGCG TTTGCCAACT
2281 GGATAACGCG AACGTGACTG ACGCGATCCT CAGCAGGGCA GGAGGGTCAA TTGCTGACTT
2341 TACCGGGCAT CGGCAAACGG CGTTTCGCGA ACTGGAACGG GTGCTTAATT TTCCGCAATC
2401 AAACTTGTGC CTTAAACGTG AGAAACAGGA CGAAAGCTGT TCATTAACGC AGGCATTACC
2461 ATCGGAACTC AAGGTGAGCG CCGACAATGT CTCATTAACC GGTGCGGTAA GCCTCGCATC
2521 AATGCTGACG GAGATATTTC TCCTGCAACA AGCACAGGGA ATGCCGGAGC CGGGGTGGGG
2581 AAGGATCACC GATTCACACC AGTGGAACAC CTTGCTAAGT TTGCATAACG CGCAATTTTA
2641 TTTGCTACAA CGCACGCCAG AGGTTGCCCG CAGCCGCGCC ACCCCGTTAT TAGATTTGAT
2701 CAAGACAGCG TTGACGCCCC ATCCACCGCA AAAACAGGCG TATGGTGTGA CATTACCCAC
2761 TTCAGTGCTG TTTATCGCCG GACACGATAC TAATCTGGCA AATCTCGGCG GCGCACTGGA
2821 GCTCAACTGG ACGCTTCCCG GTCAGCCGGA TAACACGCCG CCAGGTGGTG AACTGGTGTT
2881 TGAACGCTGG CGTCGGCTAA GCGATAACAG CCAGTGGATT CAGGTTTCGC TGGTCTTCCA
2941 GACTTTACAG CAGATGCGTG ATAAAACGCC GCTGTCATTA AATACGCCGC CCGGAGAGGT
3001 GAAACTGACC CTGGCAGGAT GTGAAGAGCG AAATGCGCAG GGCATGTGTT CGTTGGCAGG
3061 TTTTACGCAA ATCGTGAATG AAGCACGCAT ACCCGCTTGC AGTTTGTAAG GTATAAGGCA
3121 GTTATTGGTG CCCTTAAACG CCTGGTGCTA CGCCTGAATA AGTGATAATA AGCGGATGAA
3181 TGGCAGAAAT TCGCCGGATC TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG
3241 ACAAACTACC TACAGAGATT TAAAAAACCT CCCACACCTC CCCCTGAACC TGAAACATAA
3301 AATGAATGCA ATTGTTGTTG TTAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG
3361 CAATAGCATC ACAAATTTCA CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT
3421 GTCCAAACTC ATCAATGTAT CTTATCATGT CTGGATCGAT CCCCGGGTAC
Figure 22: Nucleic acid sequence of the SV40/APPA+intron plasmid (SEQ ID NO:6).

LOCUS SV40/APPA 5421 bp DNA CIRCULAR SYN 14-APR-2000 

DEFINITION Ligation of SV40 promoter/enhancer into CAT/APPA+intron

ACCESSION SV40/APPA

REFERENCE 1 (bases 1 to 5421) 

SOURCE synthetic construct. 

ORGANISM synthetic construct 
artificial sequence.
KEYWORDS SV40 promoter/enhancer, acid glucose-1-phosphatase; appA gene;
periplasmic phosphoanhydride phosphohydrolase; artificial
sequence;

AUTHORS Golovan, S., Forsberg, C.W., Phillips, J.

JOURNAL Unpublished. 

DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA) 
gene, 

ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375

VERSION M58708.1 GI:145283

SOURCE Escherichia coli DNA. 

ORGANISM Escherichia coli 

Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;

Escherichia.

REFERENCE 1 (bases 40-1337)

AUTHORS Dassa,J., Marck,C. and Boquet,P.L.

TITLE The complete nucleotide sequence of the Escherichia coli gene appA

reveals significant homology between pH 2.5 acid phosphatase
and glucose-1-phosphatase

JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990)

MEDLINE 90368616 

FEATURES Location/Qualifiers
Source 40..1337
/organism="Escherichia coli"
/db_xref="taxon:562"

sig_peptide 40..105
/gene="appA"
CDS 40..1337
/gene="appA"
/standard_name="acid phosphatase/phytase"
/transl_table=11
/product="periplasmic phosphoanhydride phosphohydrolase"
/protein_id="AAA72086.1"
/db_xref="GI:145285"
/translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"



mat_peptide 106..1334
/gene="appA"
Figure 22 (continued):

/product="periplasmic phosphoanhydride phosphohydrolase"
mutation replace (46..48, "gcg changed to gcc")
/gene="appA"
/standard_name="A3 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"
mutation replace (1320..1322, "ccg changed to ccc")
/gene="appA"
/standard_name="P428 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"
mutation replace (1323..1325, "gcg changed to gct")
/gene="appA"
/standard_name="A429 mutant"
/note="created by site directed mutagenesis"
/phenotype="silent mutation"



DEFINITION Plasmid pBLCAT3 (bases 2200 to 4924) 
ACCESSION X64409 
VERSION X64409.1 GI: 58163 

SOURCE synthetic construct. 

ORGANISM synthetic construct 
artificial sequence. 
REFERENCE 1 (bases 2200 to 4924) 
AUTHORS Luckow,B.H.R.
TITLE Direct Submission

JOURNAL Submitted (06-FEB-1992) B.H.R. Luckow, German Cancer Res
Center, Im Neuenheimer Feld 280, W-6900 Heidelberg, FRG
REFERENCE 2 (bases 2200 to 4924)
AUTHORS Luckow,B. and Schutz,G.

TITLE CAT constructions with multiple unique restriction sites for
the functional analysis of eukaryotic promoters and regulatory
elements
JOURNAL Nucleic Acids Res. 15 (13), 5490 (1987)
MEDLINE 87260024
COMMENT Promoterless CAT vector for transient transfection experiments
with eukaryotic cells. Allows the analysis of foreign
promoters and enhancers.

FEATURES Location/Qualifiers
source 2200 to 4924
/organism="synthetic construct"
/db_xref="taxon:32630"

SV40 t intron 1380..1993
/note="SV40 signals"
polyA_signal 1990..2230
/note="SV40 signals"
CDS complement (3471..4317)
/codon_start=1
/transl_table=11
/gene="Amp"
/product="beta-lactamase"
/protein_id="CAA45753.1"
/db_xref="GI:58165"
Figure 22 (continued):

SV40 promoter/enhancer 5023..5402
/note="SV40 signals"



BASE COUNT 1413 a 1321 c 1331 g 1355 t 

ORIGIN
1 CGAGATTTTC AGGAGCTAAG GAAGCTAAAA GCCGCCACCA TGAAAGCCAT CTTAATCCCA
61 TTTTTATCTC TTCTGATTCC GTTAACCCCG CAATCTGCAT TCGCTCAGAG TGAGCCGGAG
121 CTGAAGCTGG AAAGTGTGGT GATTGTCAGT CGTCATGGTG TGCGTGCTCC AACCAAGGCC 
181 ACGCAACTGA TGCAGGATGT CACCCCAGAC GCATGGCCAA CCTGGCCGGT AAAACTGGGT 
241 TGGCTGACAC CGCGNGGTGG TGAGCTAATC GCCTATCTCG GACATTACCA ACGCCAGCGT 
301 CTGGTAGCCG ACGGATTGCT GGCGAAAAAG GGCTGCCCGC AGTCTGGTCA GGTCGCGATT 
361 ATTGCTGATG TCGACGAGCG TACCCGTAAA ACAGGCGAAG CCTTCGCCGC CGGGCTGGCA
421 CCTGACTGTG CAATAACCGT ACATACCCAG GCAGATACGT CCAGTCCCGA TCCGTTATTT 
481 AATCCTCTAA AAACTGGCGT TTGCCAACTG GATAACGCGA ACGTGACTGA CGCGATCCTC
541 AGCAGGGCAG GAGGGTCAAT TGCTGACTTT ACCGGGCATC GGCAAACGGC GTTTCGCGAA
601 CTGGAACGGG TGCTTAATTT TCCGCAATCA AACTTGTGCC TTAAACGTGA GAAACAGGAC
661 GAAAGCTGTT CATTAACGCA GGCATTACCA TCGGAACTCA AGGTGAGCGC CGACAATGTC 
721 TCATTAACCG GTGCGGTAAG CCTCGCATCA ATGCTGACGG AGATATTTCT CCTGCAACAA 
781 GCACAGGGAA TGCCGGAGCC GGGGTGGGGA AGGATCACCG ATTCACACCA GTGGAACACC
841 TTGCTAAGTT TGCATAACGC GCAATTTTAT TTGCTACAAC GCACGCCAGA GGTTGCCCGC
901 AGCCGCGCCA CCCCGTTATT AGATTTGATC AAGACAGCGT TGACGCCCCA CCACCGCAAA 
961 AACAGGCGTA TGGTGTGACA TTACCCACTT CAGTGCTGTT TATCGCCGGA CACGATACTA 
1021 ATCTGGCAAA TCTCGGCGGC GCACTGGAGC TCAACTGGAC GCTTCCCGGT CAGCCGGATA
1081 ACACGCCGCC AGGTGGTGAA CTGGTGTTTG AACGCTGGCG TCGGCTAAGC GATAACAGCC
1141 AGTGGATTCA GGTTTCGCTG GTCTTCCAGA CTTTACAGCA GATGCGTGAT AAAACGCCGC
1201 TGTCATTAAA TACGCCGCCC GGAGAGGTGA AACTGACCCT GGCAGGATGT GAAGAGCGAA
1261 ATGCGCAGGG CATGTGTTCG TTGGCAGGTT TTACGCAAAT CGTGAATGAA GCACGCATAC
1321 CCGCTTGCAG TTTGTAAGGC AGTTATTGGT GCCCTTAAAC GCCTGGTGCT ACGCCTGAAT
1381 AAGTGATAAT AAGCGGATGA ATGGCAGAAA TTCGCCGGAT CTTTGTGAAG GAACCTTACT
1441 TCTGTGGTGT GACATAATTG GACAAACTAC CTACAGAGAT TTAAAGCTCT AAGGTAAATA
1501 TAAAATTTTT AAGTGTATAA TGTGTTAAAC TACTGATTCT AATTGTTTGT GTATTTTAGA
1561 TTCCAACCTA TGGAACTGAT GAATGGGAGC AGTGGTGGAA TGCCTTTAAT GAGGAAAACC
1621 TGTTTTGCTC AGAAGAAATG CCATCTAGTG ATGATGAGGC TACTGCTGAC TCTCAACATT
1681 CTACTCCTCC AAAAAAGAAG AGAAAGGTAG AAGACCCCAA GGACTTTCCT TCAGAATTGC
1741 TAAGTTTTTT GAGTCATGCT GTGTTTAGTA ATAGAACTCT TGCTTGCTTT GCTATTTACA
1801 CCACAAAGGA AAAAGCTGCA CTGCTATACA AGAAAATTAT GGAAAAATAT TCTGTAACCT
1861 TTATAAGTAG GCATAACAGT TATAATCATA ACATACTGTT TTTTCTTACT CCACACAGGC
1921 ATAGAGTGTC TGCTATTAAT AACTATGCTC AAAAATTGTG TACCTTTAGC TTTTTAATTT 
1981 GTAAAGGGGT TAATAAGGAA TATTTGATGT ATAGTGCCTT GACTAGAGAT CATAATCAGC 
2041 CATACCACAT TTGTAGAGGT TTTACTTGCT TTAAAAAACC TCCCACACCT CCCCCTGAAC 
2101 CTGAAACATA AAATGAATGC AATTGTTGTT GTTAACTTGT TTATTGCAGC TTATAATGGT 
2161 TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC ACTGCATTCT 
2221 AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTATCATG TCTGGATCGA TCCCCGGGTA 
2281 CCGAGCTCGA ATTCGTAATC ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC
2341 ACAATTCCAC ACAACATACG AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG TGCCTAATGA
2401 GTGAGCTAAC TCACATTAAT TGCGTTGCGC TCACTGCCCG CTTTCCAGTC GGGAAACCTG
2461 TCGTGCCAGC TGCATTAATG AATCGGCCAA CGCGCGGGGA GAGGCGGTTT GCGTATTGGG
2521 CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT GCGGCGAGCG
2581 GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA TAACGCAGGA
2641 AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC CGCGTTGCTG
2701 GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG CTCAAGTCAG
2761 AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG AAGCTCCCTC
2821 GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT TCTCCCTTCG
2881 GGAAGCGTGG CGCTTTCTCA ATGCTCACGC TGTAGGTATC TCAGTTCGGT GTAGGTCGTT
2941 CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG CGCCTTATCC
3001 GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT GGCAGCAGCC
3061 ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT CTTGAAGTGG
Figure 22 (continued):

3121 TGGCCTAACT ACGGCTACAC TAGAAGGACA GTATTTGGTA TCTGCGCTCT GCTGAAGCCA 
3181 GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC CGCTGGTAGC 
3241 GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC TCAAGAAGAT 

3301 CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG TTAAGGGATT
3361 TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA AAAATGAAGT
3421 TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA ATGCTTAATC
3481 AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT CCATAGTTGC CTGACTCCCC
3541 GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC TGCAATGATA 
3601 CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA TAAACCAGCC AGCCGGAAGG 
3661 GCCGAGCGCA GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT TAATTGTTGC 
3721 CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT TGCCATTGCT 
3781 ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA
3841 CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT
3901 CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA
3961 CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC
4021 TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA
4081 ATACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT
4141 TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC GATGTAACCC
4201 ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA
4261 AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA
4321 CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC 

4381 GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC
4441 CGAAAAGTGC CACCTGACGT CTAAGAAACC ATTATTATCA TGACATTAAC CTATAAAAAT
4501 AGGCGTATCA CGAGGCCCTT TCGTCTCGCG CGTTTCGGTG ATGACGGTGA AAACCTCTGA
4561 CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA 
4621 GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA 

4681 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA
4741 AGGAGAAAAT ACCGCATCAG GCGCCATTCG CCATTCAGGC TGCGCAACTG TTGGGAAGGG
4801 CGATCGGTGC GGGCCTCTTC GCTATTACGC CAGCTGGCGA AAGGGGGATG TGCTGCAAGG
4861 CGATTAAGTT GGGTAACGCC AGGGTTTTCC CAGTCACGAC GTTGTAAAAC GACGGCCAGT
4921 GCCAAGCTTT ACACTTTATG CTTCCGGCTC GTATGTTGTG TGGAATTGTG AGCGGATAAC
4981 AATTTCACAC AGGAAACAGC TATGACCATG ATTACGAATT CGGCGCAGCA CCATGGCCTG
5041 AAATAACCTC TGAAAGAGGA ACTTGGTTAG GTACCTTCTG AGGCGGAAAG AACCAGCTGT 
5101 GGAATGTGTG TCAGTTAGGG TGTGGAAAGT CCCCAGGCTC CCCAGCAGGC AGAAGTATGC 
5161 AAAGCATGCA TCTCAATTAG TCAGCAACCA GGTGTGGAAA GTCCCCAGGC TCCCCAGCAG 
5221 GCAGAAGTAT GCAAAGCATG CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC 
5281 CGCCCATCCC GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA
5341 TTTTTTTTAT TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT 
5401 GAGGAGGCTC GAGGAGCTTG G
Figure 23. The nucleic acid sequence of the Lama2/APPA transgene used for the generation
of transgenic mice and transgenic pigs (SEQ ID NO: 7)



LOCUS transgene 17732 bp DNA SYN 14-APR-2000

DEFINITION Lama-appA cut XhoI - 20623 to NotI 17732
ACCESSION transgene

KEYWORDS parotid secretory protein; acid glucose-1-phosphatase; appA
gene;
periplasmic phosphoanhydride phosphohydrolase; artificial
sequence;
cloning vector
REFERENCE 1 (bases 1 to 17732)

AUTHORS Golovan, S., Forsberg, C.W., Phillips, J.

JOURNAL Unpublished. 



FEATURES 

DEFINITION M. musculus Psp gene for parotid secretory protein.
ACCESSION X68699
VERSION X68699.1 GI:53809
SOURCE house mouse.

ORGANISM Mus musculus

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

REFERENCE 1 (bases 3777 to 5332;)

AUTHORS Svendsen,P., Laursen,J., Krogh-Pedersen,H. and Hjorth,J.P.
TITLE Novel salivary gland specific binding elements located in
the PSP proximal enhancer core
JOURNAL Nucleic Acids Res. 26 (11), 2761-2770 (1998)
MEDLINE 98256451

REFERENCE 2 (bases 7147 to 12653; 13952 to 17731)
AUTHORS Mikkelsen,T.R.
TITLE Direct Submission

JOURNAL Submitted (07-OCT-1992) T.R. Mikkelsen, Department of
Molecular Biology, University of Aarhus, CF Mollers Alle
130, 8000 Aarhus, DENMARK

REFERENCE 3 (bases 7147 to 12653; 13952 to 17731)

AUTHORS Laursen J, Hjorth JP.

TITLE A cassette for high-level expression in the mouse salivary
glands.

JOURNAL Gene 1997 Oct 1;198(1-2):367-72
MEDLINE 9370303



FEATURES Location/Qualifiers
source 1 to 12653; 13952 to 17731
/organism="Mus musculus"
/strain="C3H/As"
/db_xref="taxon:10090"
/chromosome="2"
/map="Estimate: 69 cM from centromere"
/clone="Lambda YP1, Lambda YP3, Lambda YP7"
/clone_lib="Lambda-PHAGE (Lambda L47.1)"
/germline
/note="Allele: b"



misc_feature 3777-5332
/gene="PSP"
/function="salivary gland specific positive acting
regulatory region"
enhancer 7147..E724
Figure 23 (continued):

/evidence=experimental
exon 11778..11824
/gene="Psp"
/note="exon a"
/number=1
/evidence=experimental
exon 12626..14190
/gene="Psp"
/note="exon b fused with exons h and i"
misc_feature 12644-12652
/function="consensus sequence for initiation in higher
eukaryotes"
misc_feature 13952-13965
/function="M13mp18 poly-linker"


DEFINITION E. coli periplasmic phosphoanhydride phosphohydrolase (appA)
gene,

ACCESSION M58708 L03370 L03371 L03372 L03373 L03374 L03375

VERSION M58708.1 GI:145283

SOURCE Escherichia coli DNA.

ORGANISM Escherichia coli

Bacteria; Proteobacteria; gamma subdivision;
Enterobacteriaceae;

Escherichia.



REFERENCE 1 (bases 12653-13951)
AUTHORS Dassa,J., Marck,C. and Boquet,P.L.
TITLE The complete nucleotide sequence of the Escherichia coli
gene appA reveals significant homology between pH 2.5
acid phosphatase and glucose-1-phosphatase
JOURNAL J. Bacteriol. 172 (9), 5497-5500 (1990)
MEDLINE 90368616

FEATURES Location/Qualifiers
Source 12653..13951
/organism="Escherichia coli"
/db_xref="taxon:562"
sig_peptide 12653..12718
/gene="appA"
CDS 12653..13951
/gene="appA"
/standard_name="acid phosphatase/phytase"
/transl_table=11
/product="periplasmic phosphoanhydride
phosphohydrolase"
/protein_id="AAA72086.1"
/db_xref="GI:145285"
/translation="MKAILIPFLSLLIPLTPQSAFAQSEPELKLESVVIVSRHGVRAP
TKATQLMQDVTPDAWPTWPVKLGWLTPRGGELIAYLGHYQRQRLVADGLLAKKGCPQS
GQVAIIADVDERTRKTGEAFAAGLAPDCAITVHTQADTSSPDPLFNPLKTGVCQLDNA
NVTDAILSRAGGSIADFTGHRQTAFRELERVLNFPQSNLCLKREKQDESCSLTQALPS
Figure 23 (continued):

ELKVSADNVSLTGAVSLASMLTEIFLLQQAQGMPEPGWGRITDSHQWNTLLSLHNAQF
YLLQRTPEVARSRATPLLDLIKTALTPHPPQKQAYGVTLPTSVLFIAGHDTNLANLGG
ALELNWTLPGQPDNTPPGGELVFERWRRLSDNSQWIQVSLVFQTLQQMRDKTPLSLNT
PPGEVKLTLAGCEERNAQGMCSLAGFTQIVNEARIPACSL"
mat_peptide 12719..13948
/gene="appA"
/product="periplasmic phosphoanhydride
phosphohydrolase"

mutation replace (12659..12661, "gcg changed to gcc")
/gene="appA"
/standard_name="A3 mutant"
/note="created by site directed mutagenesis"
/citation=[3]
/phenotype="silent mutation"
mutation replace (13934..13936, "ccg changed to ccc")
/gene="appA"
/standard_name="P428 mutant"
/note="created by site directed mutagenesis"
/citation=[3]
/phenotype="silent mutation"
mutation replace (13937..13939, "gcg changed to gct")
/gene="appA"
/standard_name="A429 mutant"
/note="created by site directed mutagenesis"
/citation=[3]
/phenotype="silent mutation"



BASE COUNT 4719 a 4125 c 4168 g 4719 t 

ORIGIN
1 TCGAGAGTAT CTTTGTCAGC TGTGCCTCCA ACAAAGGGGT ACTGTTGCCC ACATAGAAAG
61 ATCTAAACTA ATTAATTAAT CCCTCACCCG CAAATCTTTC AGTCACTAAG TTAGCACGAT 
121 TGTTGAACAA GTTCTCCAAA GGAGAGATAC AGATGAGTGC GTATAGGGTG GACCTGGCTG 
181 CTGAGGAGAC ACCTGCATCT GACTAAGAAG AGCCACGGTG TTAGTTGAAT GGTGTGGAGT 
241 AGGGTGGTTP TGTGGGACAG TAGAAAATCG AGAGGCATGT GCCGTTTAGT GAACTGATGG 
301 AAGCTACCCC AAACGACAGA GATTGTCAGT CAGGCCAATC CGTTTCGAGT TTGATGGGCA 
361 GCCGGACAGT GAGACAGACA CACCTACTCA GTTGGAGGAA GGATGAGAAC AATGGCCAGC 
421 AGGGATTGAG AGACCCTGAC AGGCGCAAGG CCCTAACACA CACACCTACC ACCTCACTTG 
481 ACAAAGCTGC CAAAGACCAA AGACTTGTTC TCCATTAGAA ATGACAGCTG GCTTGACCCG
541 ACAGCATAAT AAGCAGAGTG TACTCTGATT GGAGAACTTT AATGTGTTTC ATTCAGTATT
601 ATAAAAGGAC AGTATTACAG ATTTTGTTGT ACACTGCTGT TACATGTGGG GCAGTGTGTC
661 TTTAAGTAGG GTAAAGTACT CTTTAAAAAT GGGTCCTAGA TATTTTTTCC TTTAACTCAA
721 GTCTCTTACT GTTTAAATGA TTTTTATTTT GTTTAATATG GAGGAAAAAG AAGCGTAAAT
781 GGACAATATA TATTTAGAGA AAGATGGTTA GCTGTCAGAA AAATATGCAA ATCAAAATCA
841 CACCAAGACT GCAGCACACC CCTGTCAGAT GGCTGTGATC AAGAAAATAA ATGACAATGA
901 GTGGTGGTGA AGATGTACTA AAGGGAAACA CACACACACA CACACACACA CACACACACA
961 CACACTGGAG CAACCACTGT GGAAATCAGT ATGAATGGTC CTCAAAAACC TGAAGATAGA

1021 GCGGGGCGTG GTGGCATACA CTTTTATTCC CAGCACTGGG GAGGCAGAGG CAGGTGGATC 
1081 TCTGAGTTCC AGGCCAGCCT GGTCTATAGC ACAGGTTCTA GGACAGCCAG GGCTACACAG
1141 AAAAACCCTG CCTTGATTAA ACCAAACCAA ACCAAACCAA ACCAAACCAA ACCAAACCAA
1201 ACCAAACCAA ACCAAACCAG ACCAAACCAA AACACTGAAG ATAGAACTTC AGTATTCCAT
1261 TCCTAGATAT ATACCCAATG GAGACTAAGT CAGCAAGACA CCTGCACAGC CATGTTCACT
1321 ACTACACTGT TCACCACAGC CAGGCTGTGG AACCAGCCTG AGTSTCCATG ATAAATGAAT
Figure 23 (continued):

1381 GGATAGGTAA CTTTCAAGGT AAATGGACTC TGCTGTGTAC ATGCCTCACA TTCTGTTTAT 
1441 TCATTTTTCT TTATGAGGTG TCCATTCAGG AGTCACATGG TAGTTCTATT TTCAGTCTTC 
1501 TGAAGATACT ACACTGGTCC CCACAGTTTA CACTTTTATC AGCAGTGAAT AAGGGTTCCT 
1561 CTATCCTTAC CATCATTTGT TGTAATTTTT CTTGATGACC CTCTTTCTGA CAGGGATAGG 
1621 ATGTAATATC AGTGTGAGGA AGTACAACTT GTTTTCTAAG TATTTATTGG CCCCTTGCAT 
1681 TTCTTCTTTT GAAAACTGTC GGTTCCTGAC ATCTGCTCAG GTATTCATTG GATGTTGTTT
1741 CTTTGGTGTT TGAGTTCTTA TGAATTCTAG ATGTTAAATC CCTGCCTGTG GTTCTCTCCC 
1801 ATTCTGTAGG CTGCCTCCTC ACCCTGGCAA TTGTTGTCCT TGTTTTGCAG AAACTTTTGA 
1861 CTTCATGGAA TCTCATTTGT CAGTTTTCCC TCCTCTGCTA TAGCCTGAGC TAATGCACTG
1921 GTTTTTACAG AGCCCTGGTC TATGCCTTTA TCCTCCTCTG GCAGCTTCGG AGTTTCATTT
1981 CTTACATTTA GATCTTTGAT CCACTTTGAA CAAGTTTTGG AGCAGGGTGA GAGATACGAA 
2041 TCTAGTTCCA TTCTTCCATA TGTGATCCTA GTTTACATAG CATCGTTGGT TGAAGAGGTT 
2101 TTATTTTATT TTTAAATAAT GTGTCATAAA AAACGAGGTG GTTGTAGCAG TGTGGATTTG 
2161 TTTCTTTGTC CTTTGATCTA CAGGTCTTGT TTTGTGTCAG TCTCATGATG TTTTATTGCT 
2221 ATGGCTCTGT CATACAGTCT GAGGTCAGGT ATTGTGATAT ACCTTCAGTA TTGCTCCCTC 
2281 AGACTCAGGT TTGCTTTGGC CAGGAGTCAT CTTACTCAGT GCTCTTAGAG CTCCCCCAGC 
2341 ATGTAGCTGC TACTATTCTT AGTTGATAAA TCAGGAAACT GGGGCTCAGA GAGATTAACT 
2401 GTCTTGAACT ACTTCTGGGG AGGTGAAACG TGGAGACACT AAACTGTGTT TACCCTGTAC 
2461 TGCTCCAGTA GCTGTCGGGT GCTGGGCTAC AGCAAAGCAC CTATACTATA TATTACTCAG 
2521 GAGGTGGAAA AACTCAGCCT CCCTTGGGGT TCCCAAGCTC CCAGGTGTCC AGTCACTGCT 
2581 GGAAACCTCA TGGAGTCTGA AAGGAAGGGT TGAGGGTACA TGGGGCAGCG ATGAGGAGCC 
2641 TGGGGCTGGG ATCTCCCAAA CACCTGGATA TCCAGATGCC ACTGGGTCAG GGGGAGTTGG 
2 701 GAACAGAGTT GGGATGTCCA TGGACCTGTG ACAAGGCCAG GGCCAGGGGG AGGATAACTC 
2761 TGGCTTTACT AATTTGCGAA AGTCCTTAGC TTAGCAGCAG TTGTCTGGGA GCACAGAGGG 
2821 GCCTTCTGTA AGAGGCTCAG GCAGTGCCGC TCTGTAGGCG AAGGTCTTCT CCATGTTCCC
2881 CATGGTGGTT CTTGATGAAA GAGACAGTCC TTGGCTCCAA ACTGGTTTAT TGATTGTTCA
2941 TTGTGGAAAA TGGGTGCACA CCACCTTCTC AGGGTGGACC AGAGATCAAA TACCTTTTGC
3001 AGGGAGGAAT ATCTGGGAAG GGACGCTTAC TGGCTAAACC CTCAGGGCCT CTAGATACAT
3061 CATTAGCATG GAGAACTCTG TTCTGGGCTA CATGACCACA GGCCACATTT CCACAAGCCA
3121 CATGTGGGAA GTGTGGCACA TGTTCTAGGC CAGGAATCTG GTAGGGAGCG TGGAGCCACC
3181 TACCATCCCA GGTGGGTGCC TGGGTGCCAG GGACCCTGAA CCCGCTCAAC CTTACCAAGT 
3241 TTCCTGGCAG GGTCCACTGT CCTACACAGA AGCTGGAGGA GGTGTGAGGG TTGTGTCTTT
3301 GTGGAATGTC CCATGCTGCT TGGGGCTCAG TTTCTCCACC TGTACCTCAT TGGTTTGGGT 
3361 ATAAAAAGTG GGGATACTTT ATTATTCTCT GACTCGGTCC TGAGGAAAAA GCATCGTGGC 
3421 AGTCCAGGAA CCACACCCTG AGGTTCCTGC ACTGAAGGGA CTCCCTAAGT CTCTGGAGTC
3481 TCTCCCCTTC ACAGAGCTGC CAAAGTCTAG GTTCTTTTGA GGATAACAGA GCCATGCTTG
3541 GTAAGCAGAC AACAGCATTT GTTTACTCAA CCTTCTTTTG TCAGCTCCCT CTTCATAAAC 
3601 AAGTTGAGAC ACCATGCTGG CTTGAGGAAG ACTTCTAAAG CCAGACAACT GTGCAAGGAA 
3661 GAAGAAGAAG GGGCAAGTGG AGTTAGCCTG GATGTAGCCC TCAAAGTCTC CAGAGACCAG
3721 CCATGAAGGC TCAAGTGGAG GGCAAGACCT GCAGCAGCCA AGCATCTGGC AGGAGAGGAT 
3781 CCTGGGAACC CCTCTACCAT GACACACATT CTTCCTGCAG GTCACACTTA ATAGGCCATT
3841 TCTTATTTGG ATCTATCATG GTGTTCTGTG CGAGATTAAT GAGGTGTTAT GCTGCGAACA
3901 GAAAGTTATA TAAAAACAAG TCCCCCCCCC TTGTCACTGC TGCTAAGAAT GTAGCAGAAA 
3961 TTGTCTCAAG TGTCTCTCTA ATCAGAAACA ATAAAGGTCT CCTTGGATTC AAGCCCTCCA
4021 GTTTCCTCCT TCCTTGCTGA GCCTTGGACA CCCATACAAA CCTCCTGGAT GCTACAGCTC
4081 TGGGCAGAGA CTCCAAGGTG GGGAGAGACT GATGGTACAA AAGCAAAATA CTTGTTTGGG
4141 GGTACACCCA CTCCTCTGCC TGTGTGGTTC CTGCAGTCAG TCCTGCAGAC AGGCCCTCAG 
4201 TGGGTCTTCC ATGGGCAACA CGCAGAGGGA GGCAATGGAT GGGAATACCC ACACCCTGGT
4261 TAGTTTACCC CGGCCATGCT CTCTGCTCTT CATCCCTCCT CTGCCCTCTG CCACGGCTTT
4321 CTCTGCAGGA ATCATATCTT CATATTGGCC CACAGGTGTT CTCCTCACCC TAGCTATGAT
4381 GTTTACTTTA GAGTGACCTT AGCAGGGCTG GTGGGAATGA GTTCTAGAAG GCTCACGGAG
4441 ATGCTAGGGA AGAAACGTCT TCTAACTACT GAGGTTACTA AGTTCCTGGT GGTTGTCTCT 
4501 GCCTTTCCCT TGTTAAAGTC ACCTTGAAGT TAGTGCAGAA GAAATCAGAG CCCAGTCACA
4561 GAGTAAATAT GGTCCTGAAG ATTTCCTTTG AGTGCCCAGA ATCCATGACA TTTCAAGAGC
4621 CCTCTTTGTA CCTTAAGTCA TTTGGGGTTG TATCTTCTGC TTGATGTATG TGTGTGTGTT 
4681 TATCAAAGAG TGAGATGGTT ACATAAGAGG TGCTCTAAAG GACAGAGAGG ATTTGCAATT 
4741 GTGGCATGTG ACATCCTCAG GCCTTCCTCT GGTGCCAGGA GGAACTGATG CAGAAAAGAG
4801 TAAGAGGTCA TTTCCTGGAG GCTGTCACTA TAGAGGAGAT CTTACAGTGC ATTCCCTCCT
4861 CCAGGCCCTG CCTGAGGATA GACATGTGCT GACTGCAACT GAAACAGAGG CTTGGGATGG
4921 AGAGTTAGGT TCACAGAAGG GAGGGTGGGA GATGGATGCT TGCTGGGTTC TGGGTCTCAT
4981 CACCAGCTCC TGACCACCCG GTCAGCCCAT GTGCTTATTC CATAGCTTTC TTTTGCTATG
5041 TTTACTCAGT GTGGTGTTTG TTGGGACCCA GCAGAAGCCA GTCCCAGGCT GACAGCTGTG
5101 GATACACAGG GCAGCATGAG GGTCCTCAGC CTGAAGCAGT CAGGCTGGCA GAAGAGAAAG 
5161 ACCAGCACAC ATTCCTTCAA CCAACTATGT CTTGAAAAAC AAACATATTA TATCACATAT 
5221 ATTGCATTTA TGAGACAGCT AAAATGTACT CGGGTAGCAT GACTCCAGGT GGGGATATCT 
5281 GCAAGTGCCA TGAGTGGCAG AGGGACAGCC AATGTGAGGC AAGAAGGAAT TCTGGCTCAA
5341 CACAGCTTAG CTCCCTGGTG TTGGTTCAAA CTTTGAGAGT TTGACCACAA GCACTTTATT
5401 TTTGACATAT TTAAACAGAG CACAACTTTG GGAAAAAGTT TTCTTATGAA AATTATCACA 
5461 ATAAAGCTTA AGGCATGACT ACATTAAAAT GCCTTTGCAA AGTATATGTG CCCTCTTCCA 
5521 CAAGAATGGT TCTATTGACT GAGAAATAAT GTTCAGGATA AAGATCCAGG AAGAAAAGAT
5581 CAGGGATAAG TAAAATACTA AACTCTTTTG CAAAGTACAT AGACCCTCTT TCATAACAAT
5641 GGGTTCTATT GACTGACAAG CACTGCTCAG GAGTTGGGAA AGAGTCTAGC ATAAGCACGA
5701 TAGCCTGGAG ACTCTAGTGA GGTCTAGTCT TACAGACAGC AAAAATCACC AGGTTACAAA
5761 CTACATTCAT TTCCAGTTTT CTGATCAGGC ACAGGTATGA ATCCCTTCTG TTGAAGAGAA
5821 AAGTCCATGT GTTTAAAATA TCTGGTTTCT CCAGTGCTAT TAGCGAGAAG ACTTGAGCCC
5881 TATACAACTC CCACCTGGAG TGACATCCTG TCTTCATGGT ATATTACATA CCTAGACACG
5941 CTCATCTCAC AGACTTAGGA CTTTGTCTTC TGATCTCCAT TTCTGATCCC ACTTCCACCT
6001 TTGCCTTGAT AGTGTCATTT TCTTCACTGC CTTGGTGACA ACCATGTTAT CCTCTGTGTA
6061 TTTGAGTGTT ACCATTTTCA GATTTTACCT GTATGCAAGA TCACACAGTC TTTGTCTTTC
6121 TGTCTGGATG CATGCTAATC TCTACACAAC AACCCTTCCC CGTCACTCAG ATCTTCCTCC 
6181 ATTAACACAT ACATGGTGCT GAAGAGGCTA GGGAGCTTCC CTTCAGTGGG GAGCTAGCTG
6241 GCTATTGGGC CTTTTTGACT GTCCAGGAAG GCCCCCAATT GCTGAGACAA GAACTTAGAT
6301 TCTTCATTAT TGACTCTAAC TCATGTATCA AGCAGAAGCT AATGAATAGT TATCAACAGG
6361 ATCAGAGGTT CCAGTGTAAG ACACTTTGAC ATGAAAGAAC GGAGGAAGGA CAGATGGATG
6421 CATAAAAGCA GGACCACTGC CCCAGGAAGG TCCTGGAAAC TGATGCAGGG CAAAGGACAG
6481 GTTATAAACC AAATCTTAGG GAGTCAGGAA GAGCACAGAG GAGCTCAACC AACTGACCAC
6541 TGCTTAGGGG CTACCAACCC AATCCTCCCT GTGGGAACAG CTAAGCTATC AGCCAAGGGT
6601 AATAAACAGG CAGGACCTGT GGATGACATG GAGAGCATAG GGACCCTGGG TCCAGCCTTT
6661 AGCACCTGCA CTCTCAGGAT ACTCCACCAT TGTGTCTTAG AGAGCCTAGG GATACTGGGT
6721 CCAGCCTTTG GTACCTTCAC TCTCAGGGTA CCCCATCACT GTGTCTTGGA GAGCCTAGGC
6781 ACCCTGGGTC CAGCCTTCAG TACCTGCGCT CTCAGGACAC CCCACCATTG TCTCTTGCCC
6841 CGTCTCTTCT TCCTCTTCCT CCCTTTCATT GTCTCTTCTC TGTTTCTTTC TTGACTCTCC 
6901 TTTCCCCTCA CACCCTCACT CTAGTTCTCC CCTTCCCTCT CTGCATCACC CTATTCTCTC 
6961 TGTGGTCCCT CCACTTTCCT TTATCTCTCA TGCTTCTCTC CTCCCTCAAA TACTTGTCAC 
7021 CCACTATACT TCAGGGGCCA GCTCTAGTGA CAAAGCTGTT AATAGCAAGA CTCTCAGATC
7081 TCCAACGGCT CAGAGGAGCC AGACCCACCA AGAACTCTCT CCAGGTCCAA TTTCAGGTTC
7141 CTTCGAAAGC TTTCAGCAAA TGCTCAGGGA ACATGCCACT AACAAGAAGA TGCAAATTCC
7201 AGTTGAGAGT GGGAAAGGCC CTTGCGTAGG TCCCATCTTC CAGGCCAAGG TCAGAGGGGC
7261 TCTGTGTAAT CCGGATTGAC AGGGCTCAGA ACAATGTTTT GTTTTTAAGG TTTATTTATT
7321 TTAGGTGTTA GTGTCTTTGC TTGCATGACC TTATGTGCAT CATGTGTGTG CAGGTTCCTG
7381 ATGACAGTAG AGGAGGGCTT TGAATCCCTG GGGATAGGAA GTTACAGGAA ATTATAAGCT
7441 GCTTTGTGGG TCTTCTAGCT TTCCCAACAG AAGTGAATGC TCTTCACCAC TGAGCCATCT
7501 CTCTAGGCCC AAGAGACATT GCTTTATGGA TATAATTGTG TGTGTGTGTC AACATTGAGG
7561 AAAGGGAAAT AAAAAAAAAA CTTCAGCCGC TAAGGTTGTA CAGTTTCACT AATTGCTACT
7621 TTTAGTTGTG ATAAAATGGC AGGTGCTTCA ACATTTATAT ATACAAAAAC TTCCCTGCTG
7681 GTGGTTCAAC TGTGAGAACT GGGGTAAGTG GGTGAGTTCT CTTTTTCTGT CTCTGTCTCT
7741 GTCTCTCTCC TTCCATTCTT TCTTAAAGGA AATAAACATT GCAGCTGGGT TATAGCTCAT
7801 CAATATGGAA GTTACAGAAG TGAAAAAAGG CATTGCCTTG GTGGGTGGTG TTACCAGCTG
7861 ATTTTTGGTT GTCCTGCAAG GAGGTCTGGG GACTGGCTGC TCTGTCTCTG TCTGTATGAG
7921 TGAGGGAAGT CTGGGGAGCA GATTCCCTAA CCTTCAGCCT GGCCTGGTTC CTGAGTGAAC
7981 CCAGCCTCTC TGGTCCTAGT AGCTTTTTCC AAACAGGAAT CTGAGTGGTG ACAGGGAACA
8041 AGTACCAGCC CATTGCTTAA GTGCCAGGGT TAGTGAGGGC AGGAAGCTGC CATAGCTGGG 
8101 ATTAGTAGTT GTATTGGATG TAGGAAGTCC TATCCTGGGA CAGCTAATCC TTAATGCTTC 
8161 ACTGGAGATT TTCAATGAGA AATTTATCCC ACGGCCCATA TGGCCCCATC CTTTTGTCTC 
8221 CAACAGCCAA GTATTTTCCA TTAGAGGAGA CTTCCTGTAC ACTTGATGGA TGCTCATTCC
8281 AAGGTGACTT GGGGCAGTCA GTACAGACTT GGGATGACCT CTGACAGGCT AACCTCTCCC
8341 CAACAAGGGC CCTCTATGTT TGCTATGTAA TGTAATGTCA GACATTGTCA GGAGTGTCCG
8401 CAGCACAGCC TGCCCAGTGT GAGGGCTCTC ATAGGTTTCC CACTGTCTTA TCTACACAGG
8461 GATAACGAGG AGGTAAGCTG CAGTTCCCAG TCTCACTTCA CAGAGGAAGA GATAACCCCA
8521 TCCCAGGTCA TGTAGCCAGC AGTGGAAAGA ATGAGGATTT GAACTCAGGT CTTCCAAGTC
8581 CCATTGATAG CATCTCCTCA CAAGTCCCTT GCCACCCTCA CGATGCCTTA GACACTTGCC
8641 TGCCCTTTAT ACTAAGGAGA TGCAGGTACA AGGGGTTTAC CCATGTAGCA GCTGAGGCAG
8701 CTGGGGATAG ATACCAGCAG CAGGCCTGAT GTCACCACTC TAACTCCAGC ATCCCCAGTC
8761 TGTGTTCCTG GAGTGTGAAA ATCCCTACTT AACAAGATTG TGCAACAGTC CTTGGCTCTG
8821 TGACCCATAG CTGGAAACAG GATTCTCATT GATTTGTGGA ACATGGTGGC AGCCAGCCAA
8881 AAAGAGGGTC TGCATACAGA AGACACGTGT GGCAAGGCCA CAGCAGACTC TGACTACCTT
8941 AGCTTACAGA ATTACAAGGT CATAATGTCC TCTGCTTTGG TCACCTCATG TTAAGGACAG
9001 GCCCTAATGA AGATGGGGCA GAAGACTGAA GGAATGGCCA ACCAATAACT GGCCCAACTT 
9061 GAGACCCATC CTACAGGCAA GCATCAATTC CTGACACTAC TAATGATACT CTGTTATGCT 
9121 TGCAGACAGA AGCCTAGCAT AACTATCCTC CGAGAGGTCC ACCCAGCAAC TGACTGAAAC
9181 AGAAAAAGAT ATCCACAGGC AAACAGTGGA TGGAGGTCAG GGACTATTAT GGGAGAGCTG
9241 TGGGAAGGAT TAAAAACCCT GAAGGGGATA GGAACCCCAC AGGAAGACCA ACAGAGTCAA 
9301 CTAAGAGACC TGTGGGAGCT CTCAGAGACT GAGCCACCAA CCAAAGAGCA TACACAGGCC
9361 GGTCCGAGGC ACCTGGCACG TGTGAAGCAG ACATGCAGCT CAGTCTCCAT GTAGGTCCTC
9421 CAATAAGCGG TAGCCTGACT GCAGTATCCA ATCCCCAACA GGGCTGCATA GTCTGGCCTC
9481 AGTGGGGGAG GATGCCCCTA ATCCTGCAGA GACTTGATGA GTGGAGAGCT ATCCAGGGGG
9541 AACCCACCCT CTCTGAGAAG GGAATGGGGA TGGGGGAGGG ACTCTGTGAA GAGGGGACAA
9601 GGACAAACAA GAACCTCAAA TAGGTCAGGC CCTAAAGGCT TGCTAAGTAG CAGTGGCCCA
9661 GCTCTGTCCT GTTCCTCAGC CCAAGGCTCA GCTCCCACCT GTTTCTGTGT TTTTCTGGCT 
9721 TTTCATGGGC CTAGGACTTG GTGACCAGTT CAAACAATGG GGCCTGTGGA AGACACAATA 
9781 TACAAGACTA GGGACATTCC TGTTCTGCTG ACTATCCATA GCCTGATGTA GGTGGAAGGA
9841 CCCAATCACT GGATTTCTAC CCTTGCACAA CCTTGACAGC TGAGGGCCTC TCAGAAACCT
9901 ATTTCTTCCA CTGAAAAATG AGACTCTCAA ATGAACGTCG TGACAATCAT CAGGCTTATT
9961 AAAGAGGTGT ATCTAACCTG AATGGCAAGC AGACAGCAGG CAAATGTCTG TATCAACCTC
10021 TAGGAAGGAC AAGAACTGCT CACTGCTGCC CCCCAGGAGG CCATTTGCTG AAACAGCTGC
10081 TCTCCTGCTG GTGCACAGGC CCTGCCTTCT CATTGCAGCC ACAGCCCCTT CCTGTCTGAA
10141 CCTCCTGTCA GGTCACTGGG AAACAGATCA AGATGGAACA GGACAGCTCC TGATGGTAAA
10201 TAAAAAACAG TGGTCATGGC TATTCATAGG GGTTTATGCT TCTTCAGTCC ACACTGTGAA
10261 GAGCTGTGGG CATGAACCAC AGTGTTCGAG GTAGAGTTGG GGTTCTGAAA TTCACAGTGG
10321 GGTGAGCTCA GTAAATGTGA GCTGGAGGTC ACTCGTGAGA CACACAGTCC TGCTGCTTCT
10381 GTTCCCAATA TCCTGAGGAG ACGACACATC TACTTTGTTC AGAGGCCACA GTCTAGTTGA
10441 CCTGAGAGTT ACCAGTTTCT TATTTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG
10501 TGTTGTTCGT GTGTGAGTGC AGGTGCACAT ATGATAGCGT ACACGTTGAG GTCAGAGGAT
10561 AACTATCAGG CGTTGTCCCC TCCTACTTTT CCTCGGACTC TGGAGAACAA ACATGGGTCC
10621 TTATTCCAGG GGAGCAAGTC GCTGTTGGCT GACACATCTT GCTCACATAC ATTTTACCTA 
10681 GACAATGGAG CCTCCATCAG AGTATTACTT TAGCTCCTCA CCGATGGCAA TGCACCACCT 
10741 CTCTACCCAC ATAGGAGTTG GGTCTCCACA CACCCCCACA CCCCCTTCAC CAAAACGTTT 
10801 TCAGTTACTT TATCTGGTAA AGTTCATCAG AGAATGAAGC CAGTATTAAG AACATGGAAT
10861 CATTTGGGAA CCTGGATCTA GCAATACCCC ACCCTAGATG GAGTTGCTGA GTTTTCACCT
10921 CAGATTATAA TTCCCCCCTA GCTTCTATGG TTTATTCTGA AACCAGGGGA ACTCGATTCC 
10981 TCCCTTTGGA CCACAGACAT CCTGGCTTGT GAATTCACAT GTCATCTACT GCTAATCCAT
11041 TGGTAGTATG TGGCTCACAG AGACACACTA CAGTCATGGC CAATGTCAAG GTAGGACAGA 
11101 TGTGAATCAT TCCCCCAGTC CTGCTGTTTT CATGACTAAC CCTCCTCAGC ACAGTGACCA 
11161 TGAACCTACT TTTCCCCTCC TTTTATTTTT AGAATTGCTG GAATTTTCTA TTTTGAGAAA
11221 TAATAGCCTT GGGCAGCATT AAACAAAATC ATCTAGAAAG CTGGTTTAAA ATACAGATGG
11281 TTGAGTCAGT GAAAGAGTGA GGAATGTCAT TATTGGCCCC TCACAGAGGC TGGCTCACTC
11341 CAGCAGAGGT GGTTGAAGCT CTTGGACACG GGTCAGGTGC ATAGGAAAGG TNGTCTGGGA
11401 CACTGAGAAC CACAATTGAA CAAACAGAAC TGTTGGCTTT TTTTTTTTTA AATGAGTTCT
11461 CAAAAAATGA CTGGCTAGCT TAGGCAAATA CTTCGAGCCA ACCCAACAGA ACATTCTTCC
11521 ATTGATTCAT TCTGGATCTT CTTTCTAGAC AATACTGAAC TGACCCCTTG TTGGCAGTCT
11581 CAAGTTTGAC AACATAGGGC TTTGAACTTG GCACAAGGTC CATCACTGTC ACCCAAGCAT
11641 CCTGGGTGAC CTTTGGGTTG GAATATCTTG GCTAACCTTA GATATTTTCT TTGGAGTATC
11701 TTTAGAACAT CCAGGAAATA GGGCTTGATT CTCATCCTGG GACCACAATA TAAGTCACCC
11761 TAGAATCCCA GGAGATCGTG CAGAGAAACA AGGATCTCTC TCGTGTGCfeC CCTTCTTCAA
11821 AGCAGTGAGT AGTGACTCCA CTAAACTGAG TTCCCATCTG AGAGTCCACA GGAGGCTTTG
11881 GGGCAAGAAG CAGAGGGAAG GCACTGTTTG TGTTGGTAAA GTTTTGACTC TAACAAATTT
11941 GAAGACATAG ATGACATTGT GTCAGACTAA CAACAACCTA GACTCATGTG GGTTCTGTTT 
12001 AGGGATCAGA TTTTATTCAT CAATGACTTG TCTTAGTGTA TAGAGAAAGG CTTCCTACTG
12061 GAGTGTAGGC TCAATAATGA CAGAAGAGAT AGCTATTTCC CCTAGGGACT GTGCTGCTCC
12121 AAGTTTGGTG GAGAAAGGCA GTGGGGAACC TAGATGTGCT CTCTGGGGAG GGGGTCTGAA
12181 GCTGGCTTCA TAGAAGGTGT GAAGTTTTGC TGAAACATCT AAACAGAATT ATAGCTTAGG 
12241 AAAGTGAGCA GGCAAGGCAG GGAATGTGTT GCATATGTAT ATGTACATGA ATATATTATG
12301 TTATAGATAC ACACACATTT GAACCTCATT TGCAGATGAC AGAAAATAGG TTATTTTGCC
12361 TCTCTTAACT GCTAAGCACA ATGACTTCCA GTTCCATCCA TTTCCTGAAA TGCCACAATT
12421 TCATTTTTCA TTGTGGCTGA ATAAAATTCC ATTGCAGACT GGGCCCTACT TCATCCACTC 
12481 CTGAGGGCAG GCATATCCCC TGGCTCCATT TCTTACCTAT TGTGAAGAGA AGTGCAACTG 
12541 TCTTGTTGAA AGGCAAGCGT GAGAGAGGCA GGCACTAATT GTGGGTTTTT GTTTCXJCTT
12601 CCTGCTATGA CTCTCGATTT GTCAGAACCA AAGATCGATA AAAGCCGCCA CCATGAAAGC
12661 CATCTTAATC CCATTTTTAT CTCTTCTGAT TCCGTTAACC CCGCAATCTG CATTCGCTCA
12721 GAGTGAGCCG GAGCTGAAGC TGGAAAGTGT GGTGATTGTC AGTCGTCATG GTGTGCGTGC
12781 TCCAACCAAG GCCACGCAAC TGATGCAGGA TGTCACCCCA GACGCATGGC CAACCTGGCC
12841 GGTAAAACTG GGTTGGCTGA CACCGCGCGG TGGTGAGCTA ATCGCCTATC TCGGACATTA
12901 CCAACGCCAG CGTCTGGTAG CCGACGGATT GCTGGCGAAA AAGGGCTGCC CGCAGTCTGG
12961 TCAGGTCGCG ATTATTGCTG ATGTCGACGA GCGTACCCGT AAAACAGGCG AAGCCTTCGC
13021 CGCCGGGCTG GCACCTGACT GTGCAATAAC CGTACATACC CAGGCAGATA CGTCCAGTCC
13081 CGATCCGTTA TTTAATCCTC TAAAAACTGG CGTTTGCCAA CTGGATAACG CGAACGTGAC
13141 TGACGCGATC CTCAGCAGGG CAGGAGGGTC AATTGCTGAC TTTACCGGGC ATCGGCAAAC
13201 GGCGTTTCGC GAACTGGAAC GGGTGCTTAA TTTTCCGCAA TCAAACTTGT GCCTTAAACG 
13261 TGAGAAACAG GACGAAAGCT GTTCATTAAC GCAGGCATTA CCATCGGAAC TCAAGGTGAG 
13321 CGCCGACAAT GTCTCATTAA CCGGTGCGGT AAGCCTCGCA TCAATGCTGA CGGAGATATT
13381 TCTCCTGCAA CAAGCACAGG GAATGCCGGA GCCGGGGTGG GGAAGGATCA CCGATTCACA
13441 CCAGTGGAAC ACCTTGCTAA GTTTGCATAA CGCGCAATTT TATTTGCTAC AACGCACGCC
13501 AGAGGTTGCC CGCAGCCGCG CCACCCCGTT ATTAGATTTG ATCAAGACAG CGTTGACGCC
13561 CCATCCACCG CAAAAACAGG CGTATGGTGT GACATTACCC ACTTCAGTGC TGTTTATCGC
13621 CGGACACGAT ACTAATCTGG CAAATCTCGG CGGCGCACTG GAGCTCAACT GGACGCTTCC
13681 CGGTCAGCCG GATAACACGC CGCCAGGTGG TGAACTGGTG TTTGAACGCT GGCGTCGGCT
13741 AAGCGATAAC AGCCAGTGGA TTCAGGTTTC GCTGGTCTTC CAGACTTTAC AGCAGATGCG
13801 TGATAAAACG CCGCTGTCAT TAAATACGCC GCCCGGAGAG GTGAAACTGA CCCTGGCAGG
13861 ATGTGAAGAG CGAAATGCGC AGGGCATGTG TTCGTTGGCA GGTTTTACGC AAATCGTGAA
13921 TGAAGCACGC ATACCCGCTT GCAGTTTGTA AGGTACCCGG GGATCACAAC TTGCCCTCTG
13981 AAGAGGAAGA ACAGAAGGAT GCCACAACTC TCCTGCTGGC TACTCTCCAG TGGTTTCATC
14041 TTACTTCTGA TGGCATTTCC CTCTAGAAAG TGCTACTATC ATCCACACAT TTCTACCTGA
14101 GACCACCCAA AGGACCCTCC CAAATTCTCT TCCTCTCTGA GTAGTCTCCA CACCTGTTAC 
14161 CACCATCCCA GAATTAAAAT CCTAACTGCA CTCTGGCGTG TGACTTGCCT CAGTCCTTGC 
14221 AATAAGAGTT GTTGGCAGTG CCAGGCGTGG TGGCGCACGC CTTTAATTCC AGCACTTGGG
14281 AGGCAGAGGC AGGCGGATTT CTGAGTTCGA GGCCAGCCTG GTCTACAGAG TGAGTTCCAG
14341 GACAGCCAGG GCTATACAGA GAAACCCTGT GTCGAAAAAC CAAAAAAAAA AAAAAAAGTT
14401 GTTGGCAGAG TGTGGGTTAT ATACCAGGTG GAGATTTCAA ATGAGTGGCT GAAGCTGTAG
14461 CCAGAAGGAA CTTAGAGGAT AGCTCATAAC TTAAAAAGAA ATGTAGAGAG TAGCAGAAAC
14521 ATTGAGAGAG TGGGCACACA GCCACTGTGT GAATGTGGCA GAACACAATC CAGCCAGCTA
14581 TACATGCATA AGTGTATATT GGCGCCATCC TGACTGATGA GACACAGGAA AACAGATAGA
14641 CGGGGTTAGG TGGCCATGGC CTTTCCTGCC TGCCTCTTCC TAAGGGTCAT CTCAAGACCT
14701 TATGCTCTCT TAACTCTTCC ATTGCTACTT AGCTTCTAGA TATCACCTCC AGATTAGTCT
14761 CCTTGGGTAC ATCAGTGATC CTGGTGATAT CCAGGGCTTC CTGATTCCAT CTTTGTCATA
14821 GAGGCTGCAA CTAAAGAGGT CTTCTTAATA CTTCACACCC TGATGCCAAA AGGAAGACAC
14881 AGAAGTTCAC AGAGGTGAAG TGATTCATGT AGGACATACA GTGAGCAAGC ATCAGGGTCC
14941 GGATTATCTG ACTCTACTCT AACTTTTATG TAAATGTGCT TTATGCCATT AACACTGTCA
15001 TTCCTGTGCT TCAGCTCTGG GAGACTCCCA AGCACTCTTA GGCACAAGCC ACAATTAAGG
15061 GACTCTGACA CTCTGCATTG ATTAATTAGC ATGGTGGTCT CTATGTTTCC AGATTCATGA
15121 TTGTTTCACT TTCCATATAG GCTATGAAGG GTGTGAGGAA ATTTTTTGGG GACAGAATTG
15181 GAGGCAATCC ACCTCTCTCA GGAAGCCTCT ATCTGGAAAA GCTTACAACT CAGGGACAGT
15241 AACTGTAGGC CCAGTCCTTG GTGTCCAAAA TGGGTTTTAT GGTTTGAATC-^GCAAAGCCT
15301 TCCATGTGCT CAAAGGTTTG AACATGGAGC CTCCTCCTGG TAACACTGTA TTGGAGGCTT
15361 TTGAGACTGG ATGCTCTTTG GTCCCATGTT TTGCTACATC ATCTGTCAAG ATATGACCCA
15421 GGCATGCTAC CAGCTACCAC AGACTATGCC TCTCCAGCTT TCATGTTCTC CCCACCATGA
15481 TAGACTTGTA TCTCCTAAAA ATGGAATCAA AGCAAACTTT TCCTGCATTA AGTTTTTTTT
15541 TTTCTGTTAA GTGTTTGGTC ACAGGGACAA GAAAACACTC AATACAGATA ATTAGTACCA
15601 GAGTTGAGGT TCATTGCTCT AGCAAGTTGG ATCAAATTTT TAGGGCTTTG GAACTGATTT 
15661 ATAAGAGACA TGTAGAAGAG TCTGAAGCTG TGGGCTACAG AAGTGTCACC AGTTTTTAAG
15721 AATAGTTTAA TACACCATGG GAATTGTGAA AATCAGAATG CTCACACAAA GGCAGACAGG
15781 AAAACGTGAG CATGTGGCGT GTGAGAGGGC ATAAGAAGGA ACCTAGGGGG AAATGAGCTA
15841 GAAGCCATTC GGCTACGTTA GGGAACGTGT GTGGCTGTGC TTGGCCCATG CCCTGGCAAT
15901 CTGAATGAGG CCAAATTTTA AAGGAGTGGA CTAACTCGAT TGTCAGAGAA AATATCAAGA
15961 CAGACCACCA CTCAGGCTAT GCCGTGTTTG TGACCGACCA GCTACTCTTA GCCAGCTCTA 
16021 TTGTGAAATT CCAGAGCAAT TATCAGAGCA TGAAGATACA TACAGTTTAG TGAAGIAAGG 
16081 GGTGTGGGTC CCTAAGTGGA TGGTGCATAA ATCTATGTAG GTGATGCCTA AGTGACACTT 
16141 GATAATCCAA AATATCAGCA ATGTGGAATG TCTTCCAAGG AGACCTGTAG ACACACATTT
16201 TAGAACTTTG CTCATGGCTG TAATAAATAG CTAGCTAGAA ATCATTTCCT GAAGAGGTTA 
16261 GTCTGAGTTA CGGTTCCAGG GCAAACATTC AGTGATGGCA AGGAAGGCAT TGCAGTCAGG 
16321 AGCCAAAGGT CAGCTGGTCA CATTGCATCA AGAGTAGAGA GTCAGAGTGT GAGTAGAAAG
16381 AGGATACAGG TTATAAAACC TCACTGTCCA CTCTCAGCAA TCCATTTTCT CCTAAAAGGC
16441 TTTACCTTCT AAAGATTTTA GTCTTCAAAA CCAGTACCAG TAGCCTGGGA ACAAAAGTTG
16501 AAACAAATGA GCCTTTGTGG GGCATTTCAC ACTTAAAACA GGGCATCACC TAGGAGGAGC
16561 CCTGTGTGCA GTAGGAAGTG TGGCCTCTGT GTCAGGAATG CTCAGGCTAA TAAGGGGTCC 
16621 TCTATCTGAG GGACCCTATG AAGATTCAAC AAGTAGTTGT GAGAATTCCC TGTAAATGGA 
16681 TGCTACCAAT TTGACATTTG TAGACCTGCT ATTGTGTGCT TCTTTATTGG GCTCTCCCAT 
16741 CTCCCAACTT TCCAACCCAT ATTCCACATT AATCCCTTCC ACCACCATGC AACACTAGGT
16801 AGGAGAGAAG GAAGGTTAGA AGAGAAAGTG GGTATAGATC TATTTAGACT ACTTCCTGCT
16861 GATTAGGGGC AAGTCCAATC GTCATTGTCA GGATACCTCC AACCAGCAAC CAGCAAACCA
16921 GCAAATCAGA AACAGCAAAA GCAGCCAACA AGGCAGCACT AACCAGCAGG ATTGGGGTCG
16981 GTAGCGTGGG AGCAGTCACT ACTGGTCTTC TCATGGCTTT GGCATTAATA CTCTCTCAAG
17041 AAATTCCGTA ATTTTTTCCC CACCACCTGA AATTCCGTAA TTTTAAATGC AAACTATCTA
17101 CAGCTGGCAA AAATCACATC TCTCCTAGAG CACAAGACAA ATCATAGTTA CTGGCTATTT
17161 GCAATCTGAA GCATCTCAAT ATCCCACACC TGGGATTAAA ACAAAAACAT ATTCACATCA 
17221 CATAACTGTT XXXXXTXXCC AATTTTTTAT TAGGTATTTT CTTTATTTAC ATTTCAAATG
17281 CTATCCCGAA AGTCCCCTAT ACCCTCCCAC CTCCCTGCTC CCCTACACAC CCACTCCCAC 
17341 TTTTTGACCC TGGAGTTCCC CGGTACTGGG GCATATAAAG TTTGCAAGAC CAAGGGGCCT
17401 CTCTTCCCAG TGATGGCCGA CTAAGCCATC TTCTGCTACA TATGCAGATA GAGACACGAG 
17461 CTCTGGGGGT ACTAGTTAGT TCATATTGTT GTTCCACCTA TAGGGTCGCA GACCCCTTCA
17521 GCTCCTTGGG TACTTTGTCT AGCTCCTCCA CTGGGGGCTC TGTGTTTTAT CTAATAGATG 
17581 ACTGTGAGCA TCCACTTCTG TATTTGACAG GCACTGGCCT AGCGTCACAT GAGCCAGCTA
17641 TATCAGGGTC CTTTCAGCAA AACCTTGCTG GCATGTGCAA TAGTGTCTGC GTTTGGTGGT
17701 TGATTATGGG ATGGATCCAC TAGTTCTAGA GC 